Characterization of the set of entropy functions $\Gamma^*$ is an important open problem in information
theory. The region $\Gamma^*$ is central to the theory of information inequalities, and as such could be
regarded as a key to the basic laws of information theory. Characterization of $\Gamma^*$ has several
important consequences. In probability theory, it would provide a solution for the implication problem of
conditional independence. In communications networks, the capacity region of multi-source network coding is
given in terms of $\Gamma^*$. More broadly, determination of $\Gamma^*$ would have an impact on converse
theorems for multi-terminal problems in information theory. This paper provides several new dualities between
entropy functions and network codes. Given a function $g \ge 0$ defined on all proper subsets of $N$
random variables, we provide a construction for a network multicast problem which is "solvable" if and
only if $g$ is the entropy function of a set of quasi-uniform random variables. The underlying network
topology is fixed and the multicast problem depends on $g$ only through link capacities and source rates.
A corresponding duality is developed for linear network codes, where the constructed multicast problem
is linearly solvable if and only if $g$ is linear group characterizable. Relaxing the requirement that the
domain of $g$ be subsets of random variables, we obtain a similar duality between polymatroids and the
linear programming bound. These duality results provide an alternative proof of the insufficiency of
linear (and abelian) network codes, and demonstrate the utility of non-Shannon inequalities to tighten
outer bounds on network coding capacity regions.

I. Introduction

Information inequalities are one of the central tools of information theory. An information 
inequality is a relation between information measures such as entropy and mutual information 
that holds regardless of the specific choice of joint probability distribution on the underlying 
random variables, see [1, Chapters 12-14]. Converse proofs involving chains of information 
inequalities are ubiquitous in the literature, extending back to Shannon. It is somewhat frustrating 
therefore, that a characterization of the complete set of information inequalities is lacking. Until 
the appearance of the Zhang-Yeung inequality [2], the only known inequalities were the so-called
Shannon, or basic inequalities, being consequences of the non-negativity of conditional
mutual information (which is a special case of non-negativity of information divergence). Starting
with [3], large classes of conditional non-Shannon inequalities (e.g. contingent on imposition of
certain Markov constraints) have been found [4]-[7]. A countably infinite class of unconstrained
inequalities was reported in [8], indexed by the number of random variables $N$ involved (one
inequality for each $N$). More recently, additional unconstrained non-Shannon inequalities have
been found [9]. Another countably infinite class of unconditional inequalities was recently found
in [10]. This class differs from [8], in that a countably infinite number of inequalities were found
for any fixed number of $N \ge 4$ random variables. As we shall see later, this result has profound
implications.

An intimately related concept is the set of entropy functions $\Gamma^*$. Let $\mathcal{H}[\mathcal{L}]$ be a
subset of a $2^N$-dimensional Euclidean space. Each coordinate of this space will be indexed by a subset of
a set $\mathcal{L}$ with $N$ elements. Points $h \in \mathcal{H}[\mathcal{L}]$ can be regarded as functions,
mapping from the set of all subsets of $\mathcal{L}$ to $\mathbb{R}$ with $h(\emptyset) = 0$. Points in
$\mathcal{H}[\mathcal{L}]$ belong to $\Gamma^*$ if they correspond to a consistent choice of joint entropies
for a set $\mathcal{L} = \{X_1, X_2, \dots, X_N\}$ of $N$ random variables.
Members of $\Gamma^*$ are called entropic, and members of the closure of $\Gamma^*$, denoted by
$\bar{\Gamma}^*$, are called almost entropic.

Characterization of $\Gamma^*$ is equivalent to determination of the set of all possible information
inequalities [1, Section 12.3]. This characterization is lacking for $N > 3$. In contrast, we do
know the set $\Gamma \supseteq \Gamma^*$ corresponding to the basic inequalities. This set contains some
functions that obey the basic inequalities, but are not entropy functions and do not correspond to any joint
distribution on $N$ random variables. The basic inequalities are equivalent to the polymatroid
axioms, and hence $\Gamma$ is simply the set of polymatroids, implying a polyhedral structure.

Characterization of $\Gamma^*$ is an important open problem. It gives bounds for source coding
problems [11]. As shown in [1], it would resolve the implication problem of conditional independence
(determination of all additional conditional independence relations implied by a given set of 
conditional independence relationships). In other fields, information inequalities are also closely 
linked to group theory [12] and the theory of Kolmogorov complexity [13], [14]. The focus in this 
paper is however on the link between entropy functions and the capacity region of multi-source 
network coding. 

The prevailing approach to data transport in communications networks is based on routing, in 
which intermediate nodes duplicate and forward packets towards their final destination. Although 
such a store-and-forward scheme is simple to implement, it does not guarantee efficient utilization 
of available transmission capacity. The network coding approach introduced in [15], [16] general- 
izes routing by allowing intermediate nodes to forward packets that are coded combinations of all 
received data packets. This seemingly simple change in approach yields many benefits. Not only 
can network coding increase throughput in multicast scenarios, it can also provide robustness 
to link failure [17], wiretap security [18], and minimal transmission cost [19]. Naturally, these 
advantages are obtained at the expense of increased node complexity. 

One fundamental problem in network coding is to understand the capacity region and the 
classes of codes that achieve capacity. In the single session multicast scenario, the problem is 
well understood. In particular, the capacity region is characterized by max-flow/min-cut bounds 
and linear network codes are sufficient to achieve maximal throughput [16], [20]. 

Significant practical and theoretical complications arise in more general multicast scenarios, 
involving more than one session. It was recently proved that linear network codes are not 
sufficient for the multi-source problem [20]. Furthermore, the network coding capacity region is 
unknown. In fact, there are only a few tools in the literature for studying the capacity region.

One powerful theoretical tool bounds the capacity region by the intersection of a set of
hyperplanes (specified by the network topology and connection requirement) and the set of
entropy functions $\Gamma^*$ (inner bound), or its closure $\bar{\Gamma}^*$ (outer bound) [1], [21], [22].
Recently, these bounds have been tightened to obtain an exact expression for the capacity region, again in
terms of $\Gamma^*$ [23]. Unfortunately, the capacity region, or even the bounds, cannot be computed in
practice, due to the lack of an explicit characterization of the set of entropy functions for more
than three random variables. One way to resolve this difficulty is via relaxation of the bound,
replacing the set of entropy functions with the set of polymatroids $\Gamma$. The resulting "linear
programming" bound can be quite loose. Recent work [24] based on matroid theory showed that
application of the Zhang-Yeung inequality [2] yields a tighter bound for the capacity region (by
obtaining a better outer bound for the set of entropy functions).

The main results of this paper are new dualities between non-negative functions
$g \in \mathcal{H}[\mathcal{L}]$ and
network codes. These duality results are based on the construction of a special network multicast 
problem from functions g. The underlying network topology is fixed and the multicast problem 
depends on g only through the assignment of link capacities and source rates. 

Three main kinds of duality are considered, corresponding to different restrictions on g and 
different kinds of network codes. First, we show in Theorem 1 that the constructed multicast
problem is solvable (i.e. the constructed source rates and link capacities are in the capacity
region) if and only if $g$ is the entropy function of a set of quasi-uniform random variables. This
duality is extended in Theorem 2 to show that the multicast problem is asymptotically solvable
with $\epsilon$ error if and only if $g$ is almost entropic.

The second duality restricts attention to linear network codes. We show that the multicast 
problem is linearly solvable if and only if g is linear group characterizable (i.e. g is an entropy 
function for random variables generated by vector spaces). A corresponding limiting form of 
this duality is also provided. 

Finally, by relaxing the requirement that the domain of g be subsets of random variables, we 
obtain a duality between polymatroids and the linear programming bound. 

These duality results yield several immediate implications. In particular, we provide an al- 
ternative proof to [20], [24] for the insufficiency of linear (and abelian) network codes, and 
demonstrate the utility of non-Shannon inequalities to tighten outer bounds on network coding 
capacity regions. 

The paper is organized in the following way. Section II introduces some fundamentals of
network coding. Section II-A focuses on network codes with algebraic structure, and random
variables generated by groups with a variety of algebraic structures. We establish a relation
between linear network codes and random variables generated by vector spaces and generalize
this idea to define the concept of a group network code. A central theme of the paper is the
trade-off between source rate and link capacity using network coding, i.e. determination of
the network coding capacity region. Section II-B introduces the definitions for admissibility
and achievability in the network coding context. Section III introduces the concept of
pseudo-variables, which generalize random variables in such a way that allows a notational unification
of the linear programming bound with that of [21].

Section IV proves the duality results, Theorems 1-5. These results rely on the construction in
Section IV-A of a special network and multicast problem from a function $g$. Section IV-B gives
the duality between entropic functions and solvable multicast problems. Section IV-C provides the
corresponding duality for linearly solvable multicast problems. These duality results are extended
in Section IV-D to give a similar link between polymatroids and the linear programming bound,
i.e. a function $g$ is a polymatroid if and only if the constructed source rates and link capacities
satisfy the bound. This result relies heavily on the notion of pseudo-variables introduced in
Section III, and in particular on extension and adhesion of sets of pseudo-variables, discussed
in Appendix I. Finally, in Section IV-E we give a one-way relation between the LP bound for
linear codes, and polymatroids which also satisfy the Ingleton inequality.

Section V explores the implications of our results, which include the insufficiency of linear
or even (abelian) group network codes, and the necessity for non-Shannon inequalities for 
determination of the network coding capacity region. 

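Notation: For a set $\mathcal{A}$, the power set $2^{\mathcal{A}} = \{\mathcal{B} : \mathcal{B} \subseteq \mathcal{A}\}$
denotes the set of all subsets of $\mathcal{A}$. Given a set of $|\mathcal{A}|$ variables
$\{X_a, a \in \mathcal{A}\}$, and a subset $\mathcal{C} \subseteq \mathcal{A}$, the subscript
$X_{\mathcal{C}}$ shall mean $\{X_c : c \in \mathcal{C}\}$. In contrast, the notation $Y_{[\mathcal{B}]}$
will be used to index a single variable out of a set of $2^{|\mathcal{A}|}$ variables
$\{Y_{[\mathcal{B}]} : \mathcal{B} \in 2^{\mathcal{A}}\}$. Other notation will be introduced as necessary
throughout the paper.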

II. Networks, Codes and Capacity 

A directed acyclic graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ is commonly used as a simplified model
of a communication network. The nodes $u \in \mathcal{V}$ and directed edges
$e = (\mathrm{tail}(e), \mathrm{head}(e)) \in \mathcal{E}$ respectively
model communication nodes and directed, error-free point-to-point communication links. The
terms graph and network will be used interchangeably. For edges $e, f \in \mathcal{E}$, write $f \to e$ as
shorthand for $\mathrm{head}(f) = \mathrm{tail}(e)$. Similarly, for an edge $f \in \mathcal{E}$ and a node
$u \in \mathcal{V}$, the notations $f \to u$ and $u \to f$ respectively denote $\mathrm{head}(f) = u$ and
$\mathrm{tail}(f) = u$. So far we have only specified the basic network topology. The communication problem
is specified via imposition of a connection requirement.

Definition 1 (Connection Requirement): For any network $\mathcal{G}$, a connection requirement
$M = (\mathcal{S}, O, \mathcal{D})$ is specified by three components representing the sessions, originating
nodes and destination nodes as follows. $\mathcal{S}$ is an index set of independent multicast sessions,
each of which is a collection, or stream of data packets to be multicast to a prescribed set of destination
nodes. $O : \mathcal{S} \to \mathcal{V}$ is a source-location mapping, where $O(s)$ is the originating node
for multicast session $s$. $\mathcal{D} : \mathcal{S} \to 2^{\mathcal{V}}$ is a receiver-location mapping,
where $\mathcal{D}(s) \subseteq \mathcal{V}$ is the set of nodes requiring the data of session $s$.

It should be noted that there is no specified rate requirement. The connection requirement 
differs from the usual concept of multicast requirement in that it only specifies which nodes 
require data from which other nodes, and not any particular desired information rate. 

Given a connection requirement $M$, the goal of a network code is to efficiently multicast data
for session $s$ originating at node $O(s)$ to all receivers in the set $\mathcal{D}(s)$. Nodes are assumed to
have sufficient computing power to implement any desired network coding scheme. 

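Let $\mathcal{F} = \mathcal{S} \cup \mathcal{E}$. For a network $\mathcal{G}$ and connection requirement
$M$, a network code is specified by a set of source and edge alphabets
$\{\mathcal{U}_f, f \in \mathcal{F}\}$ and a set of local coding functions

$$\Phi = \Big\{ \phi_e : \prod_{f \in \mathcal{F} : f \to e} \mathcal{U}_f \to \mathcal{U}_e, \ e \in \mathcal{E} \Big\}$$

where for ease of notation, $s \to e$ indicates $O(s) \to e$, and $f \in \mathcal{F} : f \to e$ means any
source or edge incident to edge $e$.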

Data transmission takes place as follows. Session $s \in \mathcal{S}$ generates a source symbol $U_s$, which
is assumed to be independent of other sessions and uniformly distributed over $\mathcal{U}_s$. The link
symbol transmitted along $e \in \mathcal{E}$ is $U_e = \phi_e(U_f : f \in \mathcal{F}, f \to e)$. In other
words, the symbol transmitted along an outgoing link of a node is a function of the available sources and
incident link symbols.

We will refer to a network code by $\Phi$, with the set of alphabets $\{\mathcal{U}_f, f \in \mathcal{F}\}$
implicitly defined. Since the input and link symbols are random variables, we can also refer to the code by
the set of random variables $U_{\mathcal{F}}$, where their joint distribution is implied by $\Phi$. Clearly,

$$H(U_{\mathcal{S}}) = \sum_{s \in \mathcal{S}} H(U_s) = \sum_{s \in \mathcal{S}} \log |\mathcal{U}_s|,$$

$$H(U_e) \le \log |\mathcal{U}_e|.$$

For a given network code $\Phi$ designed for a network $\mathcal{G}$ with connection requirement $M$, the
error probability $P_e(\Phi)$ is defined as the probability that at least one receiver
$d \in \bigcup_{s \in \mathcal{S}} \mathcal{D}(s)$ fails to correctly reconstruct one or more of its
requested source messages $\{U_s : d \in \mathcal{D}(s)\}$. A zero-error network code is one for which
$P_e(\Phi) = 0$, implying that the source symbols $U_s$ are
deterministic functions of the corresponding receiver-incident edge symbols.
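
As an illustration of these definitions, the following sketch (not taken from the paper) writes the familiar butterfly network as a set of local coding functions $\phi_e$ and checks by enumeration that the code is zero-error and that $H(U_e) \le \log|\mathcal{U}_e|$ on every edge. The topology, the edge names and the helpers `code`, `decoders` and `run` are assumptions introduced purely for this example; all alphabets are binary.

```python
from itertools import product
from math import log2

# Hypothetical butterfly network: sessions "a" and "b" both originate at node S,
# and receivers T1 and T2 each request both sessions.
# edge -> (inputs f with f -> e, local coding function phi_e)
code = {
    "SA": (["a", "b"], lambda a, b: a),
    "SB": (["a", "b"], lambda a, b: b),
    "AC": (["SA"], lambda x: x),
    "AT1": (["SA"], lambda x: x),
    "BC": (["SB"], lambda x: x),
    "BT2": (["SB"], lambda x: x),
    "CD": (["AC", "BC"], lambda x, y: x ^ y),  # the only coding operation
    "DT1": (["CD"], lambda x: x),
    "DT2": (["CD"], lambda x: x),
}
# receiver -> (incident edges, decoding function returning the pair (a, b))
decoders = {
    "T1": (["AT1", "DT1"], lambda x, z: (x, x ^ z)),
    "T2": (["BT2", "DT2"], lambda y, z: (y ^ z, y)),
}

def run(a, b):
    """Evaluate every edge symbol; the dict above is listed in topological order."""
    sym = {"a": a, "b": b}
    for e, (inputs, phi) in code.items():
        sym[e] = phi(*(sym[f] for f in inputs))
    return sym

def entropy(values):
    """Entropy in bits of the empirical distribution of a list of equally likely values."""
    n, counts = len(values), {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return -sum(c / n * log2(c / n) for c in counts.values())

# Sources are independent and uniform, so all four input pairs are equally likely.
runs = [run(a, b) for a, b in product((0, 1), repeat=2)]
zero_error = all(
    dec(*(sym[f] for f in inputs)) == (sym["a"], sym["b"])
    for sym in runs
    for inputs, dec in decoders.values()
)
edge_entropy = {e: entropy([sym[e] for sym in runs]) for e in code}
print(zero_error, edge_entropy)
```

In this toy case every edge alphabet has size two, so $\log|\mathcal{U}_e| = 1$ bit and the entropy bound is met with equality on every edge.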

A. Algebraic network codes 

The above formulation imposes no restriction on the choice of alphabets and local coding 
functions. However, in practice, it may be preferable to impose algebraic structure to reduce 
the complexity of encoding and decoding. The overwhelming majority of codes studied for the 
point-to-point channel are in fact linear, and linear codes are also of particular interest in the 
network coding context. 

Definition 2 (Linear Network Code): A network code $\Phi$ is linear over a finite field $\mathbb{F}_q$ if
all source and link alphabets $\mathcal{U}_f$ are vector spaces over $\mathbb{F}_q$, and all the local
encoding functions $\phi_e$ are linear.

Clearly, for a linear network code, each source alphabet is a vector subspace and the symbol
transmitted along link $e \in \mathcal{E}$ is a linear function of the inputs $U_{\mathcal{S}}$. As will be
stated in Proposition 2, the set of all the kernels of these linear functions associated with all the links
can be used to "construct" the set of source and link random variables defining the network code. To
understand this relationship, we first review the construction of random variables from a finite
group and its subgroups [12].

Definition 3 (Construction of random variables from subgroups): Suppose that $U$ is a random
variable uniformly distributed over a group $G$. For any subgroup $G_i$, the set of left cosets
of $G_i$ forms a partition of $G$. Let $\mathcal{U}_i$ be an index set of the cosets of $G_i$ in $G$. We can
define a random variable $U_i$ as a function of $U$ such that $U_i$ is the index of the coset of $G_i$ that
contains $U$, or simply that $U_i$ is the coset of $G_i$ that contains $U$. The resulting random variable is
said to be constructed from $G$ and $G_i$.

Definition 4 (Group characterizable random variables): A set of random variables $\{U_1, \dots, U_N\}$
(and its entropy function) is called group characterizable if it is equivalent¹ to a set of
random variables constructed from a finite group $G$ and its subgroups $G_1, \dots, G_N$.

If $G$ is abelian, then $\{U_1, \dots, U_N\}$ (and the entropy function) is called abelian group
characterizable. If in addition $G$ and $G_1, \dots, G_N$ are all vector spaces, then the set of random
variables (and the entropy function) is called linear group characterizable.

Denote the set of group characterizable entropy functions by $\Gamma^*_G \subseteq \Gamma^*$, the set of
abelian group characterizable functions by $\Gamma^*_{ab}$ and the set of linear (with respect to a finite
field $\mathbb{F}_q$) group characterizable functions by $\Gamma^*_{L(q)}$. Then, it is clear that
$\Gamma^*_{L(q)} \subseteq \Gamma^*_{ab} \subseteq \Gamma^*_G \subseteq \Gamma^*$.

Random variables constructed from subgroups have been shown to have many interesting
properties. For example, suppose $\{U_1, \dots, U_N\}$ is constructed from a finite group $G$ and its
subgroups $G_1, \dots, G_N$. Then $H(U_\alpha) = \log \big( |G| / |\bigcap_{i \in \alpha} G_i| \big)$ for any
non-empty subset $\alpha \subseteq \mathcal{N} = \{1, 2, \dots, N\}$ [12]. It was also proved in [12] that a
linear information inequality is valid if and only if it is satisfied by all group characterizable random
variables. Thus group characterizable random variables have an interesting role to play in the proof of
information inequalities.
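
The entropy formula above can be checked numerically on a small case. The sketch below is an illustration under stated assumptions rather than part of the paper: it takes the hypothetical group $G = \mathbb{Z}_2 \times \mathbb{Z}_2$ with its three subgroups of order two, evaluates $\log(|G| / |\bigcap_{i \in \alpha} G_i|)$ directly, and compares it with the entropy obtained by explicitly building the coset-valued random variables of Definition 3. The names `formula_entropy` and `coset_entropy` are introduced here for illustration only.

```python
from itertools import combinations, product
from math import log2

# Hypothetical example: G = Z_2 x Z_2 and its three subgroups of order two.
G = set(product(range(2), repeat=2))
subgroups = {
    1: {(0, 0), (1, 0)},
    2: {(0, 0), (0, 1)},
    3: {(0, 0), (1, 1)},
}

def add(u, g):
    """Group operation in Z_2 x Z_2 (componentwise addition modulo 2)."""
    return tuple((a + b) % 2 for a, b in zip(u, g))

def formula_entropy(alpha):
    """H(U_alpha) = log2(|G| / |intersection of G_i over i in alpha|)."""
    inter = set(G)
    for i in alpha:
        inter &= subgroups[i]
    return log2(len(G) / len(inter))

def coset_entropy(alpha):
    """Entropy of (U_i : i in alpha), where U_i is the coset of G_i containing U."""
    counts = {}
    for u in G:  # U is uniform over G
        key = tuple(frozenset(add(u, g) for g in subgroups[i]) for i in alpha)
        counts[key] = counts.get(key, 0) + 1
    n = len(G)
    return -sum(c / n * log2(c / n) for c in counts.values())

subsets = [a for r in (1, 2, 3) for a in combinations((1, 2, 3), r)]
for alpha in subsets:
    print(alpha, formula_entropy(alpha), coset_entropy(alpha))
```

Both computations give one bit for each single variable and two bits for every pair and for the triple, which is the entropy function of two independent uniform bits together with their modulo-two sum.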

Before describing some additional properties of group characterizable random variables, we 
will need the concept of quasi-uniform random variables. 

Definition 5 (Quasi-uniform random variable): A discrete finite random variable $U$ defined
on a sample space $\mathcal{U}$ is called quasi-uniform if and only if it is uniformly distributed over its
support $\Omega(U)$. In other words, the probability distribution of $U$ has the following form:

$$\Pr(U = u) = \begin{cases} 1/|\Omega(U)| & \text{if } u \in \Omega(U), \\ 0 & \text{otherwise.} \end{cases}$$

Hence, $H(U) = \log|\Omega(U)|$.

Similarly, a set of random variables $U_1, U_2, \dots, U_N$ (and its induced entropy function) is
called quasi-uniform if and only if every subset of random variables $U_\alpha$,
$\alpha \subseteq \{1, 2, \dots, N\}$ is quasi-uniform, i.e. $H(U_\alpha) = \log|\Omega(U_\alpha)|$.

¹Two sets of random variables $\{U_1, \dots, U_N\}$ and $\{V_1, \dots, V_N\}$ with probability distributions
$P_U$ and $P_V$ respectively are "equivalent" if for each $i = 1, \dots, N$, there is a one-to-one mapping
$\tau_i$ from the support of $U_i$ to the support of $V_i$ such that
$P_U(U_1, \dots, U_N) = P_V(\tau_1(U_1), \dots, \tau_N(U_N))$. In this paper, two sets of equivalent random
variables will be regarded as identical.

Lemma 1 ([12], [25]): Random variables induced by groups and subgroups are quasi-uniform.
Hence

$$\Gamma^*_{L(q)} \subseteq \Gamma^*_{ab} \subseteq \Gamma^*_G \subseteq \Gamma^*_Q \subseteq \Gamma^*$$

where $\Gamma^*_Q$ is the set of all quasi-uniform entropy functions.
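
Quasi-uniformity of a small joint distribution can be tested by brute force, by checking that every marginal is uniform on its support. The following is a minimal sketch under the assumption that the distribution is given as a dictionary from outcome tuples to probabilities; the function names are hypothetical and chosen for this example.

```python
from itertools import combinations, product
from math import isclose, log2

def marginal(pmf, idx):
    """Marginal distribution of the coordinates listed in idx."""
    out = {}
    for outcome, p in pmf.items():
        if p > 0:
            key = tuple(outcome[i] for i in idx)
            out[key] = out.get(key, 0.0) + p
    return out

def entropy(pmf):
    """Entropy in bits."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def is_quasi_uniform(pmf, n):
    """True if H(U_alpha) = log2 |support of U_alpha| for every non-empty alpha."""
    for r in range(1, n + 1):
        for idx in combinations(range(n), r):
            m = marginal(pmf, idx)
            if not isclose(entropy(m), log2(len(m)), abs_tol=1e-9):
                return False
    return True

# Two independent uniform bits and their XOR: quasi-uniform.
xor_pmf = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}
# A skewed pair: the first coordinate has probabilities (0.75, 0.25).
skewed_pmf = {(0, 0): 0.5, (0, 1): 0.25, (1, 1): 0.25}

print(is_quasi_uniform(xor_pmf, 3), is_quasi_uniform(skewed_pmf, 2))
```

The first distribution is the one induced by $\mathbb{Z}_2 \times \mathbb{Z}_2$ and its three order-two subgroups, so its quasi-uniformity is consistent with Lemma 1.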



Fig. 1. The side-information network.



Lemma 2: With reference to Figure 1, consider a simple coding problem in which there is a
transmitter (indicated by an open circle) and a receiver (indicated by a double circle) connected
by a noiseless point-to-point link. A source $U_1$ is available at the transmitter, while correlated
side-information $U_2$ is available at both transmitter and receiver. The coding problem is to encode
$U_1, U_2$ into a symbol $W$ defined on the sample space $\mathcal{W}$ such that $U_1$ can be reconstructed
perfectly at the receiver from $W$ and $U_2$.

Suppose that $\{U_1, U_2\}$ is quasi-uniform. Then one can have a zero-error code with rate
$\log \big( |\Omega(U_1, U_2)| / |\Omega(U_2)| \big) = H(U_1|U_2)$, where the code rate is defined as
$\log|\mathcal{W}|$.

Proof: Since $U_2$ is available to both transmitter and receiver, $U_1$ can be reconstructed
perfectly if the transmitter only sends the index of $u_1$ in the set
$\{u_1 : (u_1, u_2) \in \Omega(U_1, U_2)\}$ for any given $u_2 \in \Omega(U_2)$. By the quasi-uniformity of
$\{U_1, U_2\}$, the cardinality of the set $\{u_1 : (u_1, u_2) \in \Omega(U_1, U_2)\}$ is
$|\Omega(U_1, U_2)| / |\Omega(U_2)|$ for any $u_2 \in \Omega(U_2)$. Hence, one can easily
construct a zero-error code at a rate of $\log \big( |\Omega(U_1, U_2)| / |\Omega(U_2)| \big) = H(U_1|U_2)$
that solves the coding problem. ■
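
The indexing argument in the proof can be written out in a few lines. The sketch below is illustrative only: it assumes the joint support $\Omega(U_1, U_2)$ is supplied as a set of pairs on which the distribution is uniform, and the names `build_side_info_code`, `encode` and `decode` are hypothetical.

```python
from math import log2

def build_side_info_code(support):
    """Build the zero-error code of Lemma 2 from the joint support of (U1, U2)."""
    # For each side-information value u2, list the compatible u1 values in a fixed order.
    fibres = {}
    for u1, u2 in sorted(support):
        fibres.setdefault(u2, []).append(u1)
    sizes = {len(v) for v in fibres.values()}
    # Quasi-uniformity of {U1, U2} forces every fibre to have the same size.
    if len(sizes) != 1:
        raise ValueError("fibres differ in size; the pair is not quasi-uniform")

    def encode(u1, u2):
        # The transmitter sends only the index of u1 within the fibre of u2.
        return fibres[u2].index(u1)

    def decode(w, u2):
        # The receiver knows u2, so the index identifies u1 uniquely.
        return fibres[u2][w]

    return encode, decode, sizes.pop()

# Hypothetical quasi-uniform support: 6 pairs, 3 values of u2, 2 values of u1 per fibre.
support = {(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (0, 2)}
encode, decode, alphabet_size = build_side_info_code(support)
rate = log2(alphabet_size)
print(rate, all(decode(encode(u1, u2), u2) == u1 for u1, u2 in support))
```

Here the rate is $\log 2 = \log(6/3)$, i.e. one bit, which is less than the $\log 3$ bits needed to send $U_1$ without using the side-information.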

If the group and subgroups in question possess additional algebraic properties, the induced
random variables may also satisfy certain additional properties. One interesting example, proved
in [26], [27], is given as follows.

Proposition 1 (Ingleton's inequality): Suppose that the set of random variables $\{U_1, \dots, U_N\}$
is abelian group characterizable. Let $\{V_1, V_2, V_3, V_4\} \subseteq \{U_1, \dots, U_N\}$. Then

$$g(1,2) + g(1,3) + g(1,4) + g(2,3) + g(2,4) \ge g(1) + g(2) + g(3,4) + g(1,2,3) + g(1,2,4) \tag{1}$$

where $g(\alpha) = H(V_\alpha)$.
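
Inequality (1) is easy to test mechanically for any candidate function on four variables. The following sketch assumes $g$ is stored as a dictionary keyed by frozensets of indices in $\{1,2,3,4\}$; the two example functions are hypothetical and chosen only to show one function that satisfies (1) and one that does not.

```python
from itertools import combinations

def ingleton_holds(g, tol=1e-9):
    """Check inequality (1) for a function g on the subsets of {1, 2, 3, 4}."""
    def v(*idx):
        return g[frozenset(idx)]
    lhs = v(1, 2) + v(1, 3) + v(1, 4) + v(2, 3) + v(2, 4)
    rhs = v(1) + v(2) + v(3, 4) + v(1, 2, 3) + v(1, 2, 4)
    return lhs >= rhs - tol

subsets = [frozenset(c) for r in range(1, 5) for c in combinations((1, 2, 3, 4), r)]

# Entropy function of four independent uniform bits: g(alpha) = |alpha|.
independent = {a: float(len(a)) for a in subsets}

# A function chosen to violate (1): singletons 2, pairs 3 except g(3,4) = 4,
# triples and the full set 4.
violating = {a: float(min(len(a) + 1, 4)) for a in subsets}
violating[frozenset({3, 4})] = 4.0

print(ingleton_holds(independent), ingleton_holds(violating))
```

For the second function the left side of (1) is 15 and the right side is 16, so by Proposition 1 it cannot be the entropy function of abelian group characterizable random variables.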

Proposition 2: Suppose that a set of random variables $\{U_f, f \in \mathcal{F}\}$ defines a zero-error
linear network code. Then $\{U_f, f \in \mathcal{F}\}$ is linear group characterizable.

Proof: [Proof Sketch] Suppose that $\Phi = \{\phi_e, e \in \mathcal{E}\}$ is a zero-error linear network code
with inputs $U_s \in \mathcal{U}_s$ for $s \in \mathcal{S}$ and link symbols $U_e \in \mathcal{U}_e$ for
$e \in \mathcal{E}$. We will now construct a linear group characterization for the set of source/link random
variables induced by $\Phi$. Let

1) $G$ be the vector space formed by the Cartesian product $\prod_{s \in \mathcal{S}} \mathcal{U}_s$;

2) $\psi_s : G \to \mathcal{U}_s$ be a linear function such that
$\psi_s(U_{s'} : s' \in \mathcal{S}) = U_s$;

3) $\psi_e : G \to \mathcal{U}_e$ be a linear function such that $U_e = \psi_e(U_s : s \in \mathcal{S})$
(this is possible as all local coding functions $\phi_e$ are linear);

4) $G_f$ be the kernel of $\psi_f$, denoted by $\ker(\psi_f)$, for $f \in \mathcal{S} \cup \mathcal{E}$.
Hence, $G_f$ is a subspace of $G$.

Then it is straightforward to show that for any $(U_s : s \in \mathcal{S})$ and $f \in \mathcal{F}$, the
value of $\psi_f(U_s : s \in \mathcal{S})$ can be uniquely determined from the index of the coset of $G_f$
that contains $(U_s : s \in \mathcal{S})$ and vice versa. In other words, the link random variable $U_f$ is
equivalent to the one induced by the subspace $G_f$. ■

A natural interpretation of Proposition 2 is that linear network codes are those codes whose
induced source and link random variables can be characterized by a vector space and its
subspaces. Developing this line of thought more generally, we make the following definition.

Definition 6 (Group network code): A group network code is a network code $\{U_f, f \in \mathcal{F}\}$
whose source and link random variables are induced by a finite group $G$ with subgroups $G_f$,
$f \in \mathcal{F}$. Furthermore, a group network code is called abelian if $G$ is abelian.

For a group network code $\Phi = \{U_f, f \in \mathcal{F}\}$, encoding at intermediate nodes works as
follows. Suppose that the source and link random variables $\{U_f, f \in \mathcal{F}\}$ are characterized by
a finite group $G$ and its subgroups $G_f$ for $f \in \mathcal{F}$. For any $f \in \mathcal{F}$, let
$\mathcal{U}_f$ be the index set for the set of left cosets of $G_f$ in $G$. Each edge $e$ receives symbols
$\{U_f : f \to e\}$, which are indexes of cosets of $G_f$ in $G$. The symbol $U_e$ to be transmitted along
edge $e$ is the index of the left coset of $G_e$ that contains the intersection of the cosets of $G_f$
indexed by $\{U_f : f \to e\}$.
In fact, in the special case when the group and all its subgroups are vector spaces, we can
index the cosets of $G_e$ as elements in a vector space such that $U_e$ is indeed a linear function of
$\{U_f, f \to e\}$.

Example 1: An $R$-module generalizes the concept of vector space, where the scalars are
members of a ring $R$, instead of a field. It consists of an abelian group $K$, and an operation of
left multiplication by each element in $R$. In particular, for all $r, s \in R$ and $g, h \in K$,

$$\begin{aligned}
rg &\in K \\
(rs)g &= r(sg) \\
(r + s)g &= rg + sg \\
r(g + h) &= rg + rh \\
0g &= 0.
\end{aligned}$$

$R$-module codes have been proposed as generalizations of linear network codes [20]. Messages
to be transmitted along edges are elements in $K$. The only difference is that local encoding
functions must be of the form

$$U_e = \sum_{f \in \mathcal{F} : f \to e} r_{f,e} U_f$$

where $r_{f,e} \in R$. As such, there exist elements $c_{s,e} \in R$ such that

$$U_e = \sum_{s \in \mathcal{S}} c_{s,e} U_s.$$

Let $G$ be the $|\mathcal{S}|$-fold Cartesian product of $K$. For all $e \in \mathcal{E}$ and
$s \in \mathcal{S}$, let

$$G_e = \Big\{ (U_{s'} \in K : s' \in \mathcal{S}) : \sum_{s' \in \mathcal{S}} c_{s',e} U_{s'} = 0 \Big\},$$

$$G_s = \{ (U_{s'} \in K : s' \in \mathcal{S}) : U_s = 0 \}.$$

Then it is straightforward to show that $G_f$ is an abelian subgroup of $G$ for $f \in \mathcal{F}$ and that
the source and link random variables induced by the $R$-module code are characterized by the
group $G$ and its subgroups $G_f$, $f \in \mathcal{F}$.
B. The source rate-link capacity tradeoff

So far, we have only considered networks, and codes designed to meet particular connection
requirements. Typically however, each link has limited capacity, and a fundamental design
consideration is the tradeoff between supportable network throughput and link capacities. Of
primary interest is determination of the minimal link capacities
$\omega = (\omega_e : e \in \mathcal{E})$ required to transmit sources over a network at given rates
$\lambda = (\lambda_s : s \in \mathcal{S})$ such that all receivers can
reconstruct their desired messages with no, or arbitrarily small probability of error.

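Definition 7 (Admissible rate-capacity tuple): Given a network $\mathcal{G} = (\mathcal{V}, \mathcal{E})$
and a connection requirement $M$, a rate-capacity tuple $(\lambda, \omega)$ is admissible if there exists a
zero-error network code $\Phi = \{U_f, f \in \mathcal{S} \cup \mathcal{E}\}$, such that

$$H(U_e) \le \log|\mathcal{U}_e| \le \omega_e, \quad \forall e \in \mathcal{E},$$

$$H(U_s) = \log|\mathcal{U}_s| \ge \lambda_s, \quad \forall s \in \mathcal{S},$$

where $U_e$ is the message symbol transmitted along link $e$ and $U_s$ is the input symbol generated
at source $s$.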

Coding over long blocks of symbols often improves the rate of point-to-point codes. Similarly,
increased efficiency may be expected for network codes operating over a long block of source 
symbols. Therefore, we also consider the asymptotic tradeoff between source rates and link 
capacities. 

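Definition 8 (Asymptotically admissible): A rate-capacity tuple $(\lambda, \omega)$ is asymptotically
admissible if there exists a sequence of zero-error network codes
$\Phi^{(n)} = \{U_f^{(n)}, f \in \mathcal{S} \cup \mathcal{E}\}$ and
positive normalizing constants $r(n)$ such that

$$\lim_{n \to \infty} \frac{1}{r(n)} H\big(U_e^{(n)}\big) \le \lim_{n \to \infty} \frac{1}{r(n)} \log\big|\mathcal{U}_e^{(n)}\big| \le \omega_e, \quad \forall e \in \mathcal{E},$$

$$\lim_{n \to \infty} \frac{1}{r(n)} H\big(U_s^{(n)}\big) = \lim_{n \to \infty} \frac{1}{r(n)} \log\big|\mathcal{U}_s^{(n)}\big| \ge \lambda_s, \quad \forall s \in \mathcal{S}.$$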

The above two definitions consider zero-error network codes. Relaxing the requirement to 
allow arbitrarily small error probability prompts the following definition. 

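Definition 9 (Achievable rate-capacity tuple): A rate-capacity tuple $(\lambda, \omega)$ is achievable if
there exists a sequence of network codes
$\Phi^{(n)} = \{U_f^{(n)}, f \in \mathcal{S} \cup \mathcal{E}\}$ and positive normalizing
constants $r(n)$ such that

$$\lim_{n \to \infty} \frac{1}{r(n)} H\big(U_e^{(n)}\big) \le \lim_{n \to \infty} \frac{1}{r(n)} \log\big|\mathcal{U}_e^{(n)}\big| \le \omega_e, \quad \forall e \in \mathcal{E},$$

$$\lim_{n \to \infty} \frac{1}{r(n)} H\big(U_s^{(n)}\big) = \lim_{n \to \infty} \frac{1}{r(n)} \log\big|\mathcal{U}_s^{(n)}\big| \ge \lambda_s, \quad \forall s \in \mathcal{S},$$

$$\lim_{n \to \infty} P_e\big(\Phi^{(n)}\big) = 0.$$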

Assuming that the underlying network and connection requirement are known implicitly, the
sets of admissible, asymptotically admissible and achievable rate-capacity tuples will be denoted
$\Upsilon^0$, $\Upsilon^\infty$ and $\Upsilon^\epsilon$ respectively.

The preceding definitions place no restriction on the class of network codes under considera- 
tion. However, if a rate-capacity tuple is admissible/asymptotically admissible/achievable using a 
network code in a specific class C (e.g. the class of linear network codes), then that rate-capacity 
tuple is said to be admissible/asymptotically admissible/achievable by network codes in C, and 
the corresponding sets are denoted $\Upsilon^0_C$, $\Upsilon^\infty_C$ and $\Upsilon^\epsilon_C$.

In this paper, we are interested in two special classes of network codes, (i) linear network codes
(with respect to an underlying finite field $\mathbb{F}_q$) and (ii) abelian group network codes. The sets of
admissible/asymptotically admissible/achievable rate-capacity tuples by linear network codes are
respectively denoted by $\Upsilon^0_{L(q)}$, $\Upsilon^\infty_{L(q)}$ and $\Upsilon^\epsilon_{L(q)}$.
Similarly, the sets of admissible/asymptotically admissible/achievable rate-capacity tuples by abelian group
network codes are respectively denoted by $\Upsilon^0_{ab}$, $\Upsilon^\infty_{ab}$ and
$\Upsilon^\epsilon_{ab}$.

Discovering the hidden structure of these sets of rate-capacity tuples is the key to understanding
the tradeoff between source rates and edge capacities. In the following, we list some basic
structural properties of $\Upsilon^0_C$, $\Upsilon^\infty_C$ and $\Upsilon^\epsilon_C$ when $C$ is either the
class of all network codes, linear network codes or abelian group network codes.

P1) The sets $\Upsilon^0_C$, $\Upsilon^\infty_C$ and $\Upsilon^\epsilon_C$ are closed under addition. In
other words, if tuples $(\lambda, \omega)$ and $(\lambda', \omega')$ are in $\Upsilon^0_C$ (or respectively
in $\Upsilon^\infty_C$ and $\Upsilon^\epsilon_C$), then the element-wise addition of the
two tuples will still be in the same set.

P2) $\Upsilon^\infty_C$ and $\Upsilon^\epsilon_C$ are closed convex cones, and
$\mathrm{con}(\Upsilon^0_C) = \Upsilon^\infty_C$, where $\mathrm{con}(\Upsilon^0_C)$ is the minimal
closed convex cone containing $\Upsilon^0_C$.

P3) Admissibility implies asymptotic admissibility which further implies achievability,
$\Upsilon^0_C \subseteq \Upsilon^\infty_C \subseteq \Upsilon^\epsilon_C$.

III. Pseudo-variables and bounds

The sets of admissible/achievable rate-capacity tuples are difficult to characterize explicitly. In 
fact, we will show later that finding these sets is at least as hard as determining the set of entropy 
functions $\Gamma^*$. Due to the difficulty of the problem, results on characterizing the set of achievable
rate-capacity tuples are quite limited [21], [24], [28], [29]. While inner bounds and outer bounds 
constructed with entropic/almost entropic functions exist [1], these bounds are not computable 
and hence are of limited practical use. The only known computable outer bound is the Linear 
Programming (LP) bound, which is constructed using polymatroids [1]. The remainder of this 
section provides a brief review of these bounds. We use the opportunity to introduce notation 
(differing slightly from the original manuscripts), facilitating later discussion. 

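Let $\mathcal{L}$ be a nonempty finite set. Recall that $\mathcal{H}[\mathcal{L}]$ (or simply $\mathcal{H}$)
is a real Euclidean space which has $2^{|\mathcal{L}|}$ dimensions and coordinates indexed by the set of all
subsets of $\mathcal{L}$ and that $g(\emptyset) = 0$ for all $g \in \mathcal{H}[\mathcal{L}]$. Specifically,
if $g \in \mathcal{H}$, then its coordinates will be denoted by
$(g(\mathcal{A}) : \mathcal{A} \subseteq \mathcal{L})$. We call $\mathcal{L}$ a ground set. Each
$g \in \mathcal{H}$ can also be viewed as a real-valued function $g : 2^{\mathcal{L}} \to \mathbb{R}$
defined on each subset of $\mathcal{L}$.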

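Definition 10 (Polymatroid): A function $g \in \mathcal{H}[\mathcal{L}]$ is a polymatroid if it satisfies

$$g(\emptyset) = 0 \tag{2}$$

$$g(\mathcal{A}) \ge g(\mathcal{B}), \ \text{if } \mathcal{B} \subseteq \mathcal{A} \quad \text{(non-decreasing)} \tag{3}$$

$$g(\mathcal{A}) + g(\mathcal{B}) \ge g(\mathcal{A} \cup \mathcal{B}) + g(\mathcal{A} \cap \mathcal{B}) \quad \text{(submodular)} \tag{4}$$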

Note (2) and (3) imply non-negativity of a polymatroid. Let $\mathcal{L}$ be a set of discrete random
variables with finite entropies. Note that $\mathcal{L}$ contains random variables rather than indexes for a
set of random variables. This induces a function $g \in \mathcal{H}$ where $g(\mathcal{A})$ is the joint
entropy of the set of random variables $\mathcal{A} \subseteq \mathcal{L}$. Functions so defined will be
called entropy functions.

It is well-known that entropy functions are polymatroids over the ground set $\mathcal{L}$. In fact, in
the context of entropy functions, the polymatroid axioms are completely equivalent to the basic
information inequalities (i.e. non-negativity of conditional mutual information) [1, p. 297]. It is
by now well-known however that there are other information inequalities that are not implied by
the polymatroid axioms. The set of entropy functions is denoted $\Gamma^*$, while the set of polymatroids
is $\Gamma$.
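
The polymatroid axioms (2)-(4) can be verified exhaustively for a function on a small ground set. The sketch below is a brute-force check under the assumption that $g$ is stored as a dictionary keyed by frozensets; the helper names and the two example functions are hypothetical.

```python
from itertools import combinations

def powerset(ground):
    """All subsets of the ground set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_polymatroid(g, ground, tol=1e-9):
    """Check axioms (2)-(4) by enumerating all pairs of subsets."""
    subsets = powerset(ground)
    if abs(g[frozenset()]) > tol:                             # (2)
        return False
    for a in subsets:
        for b in subsets:
            if b <= a and g[a] < g[b] - tol:                  # (3) non-decreasing
                return False
            if g[a] + g[b] < g[a | b] + g[a & b] - tol:       # (4) submodular
                return False
    return True

# Entropy function of two independent uniform bits and their XOR.
xor_entropy = {s: float(min(len(s), 2)) for s in powerset({1, 2, 3})}

# Not submodular: g(1) + g(2) = 2 < g(1,2) + g(empty set) = 3.
bad = {frozenset(): 0.0, frozenset({1}): 1.0, frozenset({2}): 1.0, frozenset({1, 2}): 3.0}

print(is_polymatroid(xor_entropy, {1, 2, 3}), is_polymatroid(bad, {1, 2}))
```

The enumeration costs on the order of $4^{|\mathcal{L}|}$ comparisons, so it is only meant for small ground sets.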

While an entropy function takes a subset of random variables as argument, a polymatroid $g$
more generally takes a subset of the ground set $\mathcal{L}$ as argument, where the elements of
$\mathcal{L}$ may or may not be random variables. For simplicity, we shall call the elements of the ground
set of a polymatroid pseudo-variables. They differ from random variables in that they do not necessarily
take values, and there may be no associated joint probability distribution function.

It must be emphasized that pseudo-variables are only defined in the context of a polymatroid
$g$ defined on the ground set $\mathcal{L}$. The elements of $\mathcal{L}$ are not pseudo-variables by
themselves in the absence of an associated polymatroid.

Carrying these ideas further, we will call $g(\mathcal{A})$ the pseudo-entropy of the set of pseudo-variables $\mathcal{A}$, and $g$ is a pseudo-entropy function. Treating pseudo-variables as a set of basic objects associated with a polymatroid yields notational simplification. For example, random variables are simply pseudo-variables possessing a probability distribution such that their pseudo-entropy function is the same as the entropy function. As such, we extend the use of $H(\mathcal{A})$ to refer to the pseudo-entropy of a set of pseudo-variables $\mathcal{A}$.

Definition 11 (Entropic function): A set of pseudo-variables (and its associated pseudo-entropy function) is called entropic if its pseudo-entropy function is the same as an entropy function of a set of random variables.

Similarly, a set of pseudo-variables (and their pseudo-entropy function) is called linear group 
characterizable if its pseudo-entropy function is the same as an entropy function of a set of 
linear group characterizable random variables. 

The following two definitions generalize concepts of functional dependence and independence 
to pseudo-variables. 

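Definition 12 (Functional dependence): Let $\mathcal{L}$ be a set of pseudo-variables. A pseudo-variable $X \in \mathcal{L}$ is said to be a function of a set of pseudo-variables $\mathcal{A} \subseteq \mathcal{L}$ if $H(\{X\} \cup \mathcal{A}) = H(\mathcal{A})$. This relation will be denoted by $H(X \mid \mathcal{A}) = 0$.

Definition 13 (Independence): Two subsets of pseudo-variables $\mathcal{A}$ and $\mathcal{B}$ are called independent if $H(\mathcal{A} \cup \mathcal{B}) = H(\mathcal{A}) + H(\mathcal{B})$, and this relationship will be denoted by $\mathcal{A} \perp \mathcal{B}$. Similarly, if $H(\cup_{j \in \mathcal{J}} \mathcal{A}_j) = \sum_{j \in \mathcal{J}} H(\mathcal{A}_j)$, write $\perp_{j \in \mathcal{J}} \mathcal{A}_j$.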
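
Because Definitions 12 and 13 refer only to values of the pseudo-entropy function, they can be tested without any probability distribution. The sketch below is one possible encoding, assuming the function is stored as a dictionary keyed by `frozenset`; the helper names and the example function (pseudo-entropy $\min(|\mathcal{A}|, 2)$ on three elements) are hypothetical.

```python
from itertools import combinations

def H(h, *parts):
    """Pseudo-entropy of the union of the given collections of pseudo-variables."""
    return h[frozenset().union(*parts)]

def is_function_of(h, x, A, tol=1e-9):
    """Definition 12: x is a function of A when H({x} u A) = H(A)."""
    return abs(H(h, {x}, A) - H(h, A)) <= tol

def are_independent(h, parts, tol=1e-9):
    """Definition 13: the parts are independent when H(union) = sum of H(part)."""
    return abs(H(h, *parts) - sum(H(h, A) for A in parts)) <= tol

# Hypothetical pseudo-entropy function on {X, Y, Z}: h(A) = min(|A|, 2).
names = ("X", "Y", "Z")
h = {frozenset(c): float(min(len(c), 2))
     for r in range(4) for c in combinations(names, r)}

print(is_function_of(h, "Z", {"X", "Y"}))          # True: H(XYZ) = H(XY) = 2
print(are_independent(h, [{"X"}, {"Y"}]))          # True: H(XY) = 1 + 1
print(are_independent(h, [{"X"}, {"Y"}, {"Z"}]))   # False: H(XYZ) = 2, not 3
```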

Clearly, these definitions are consistent with the usual ones used for random variables. The following bound re-states the linear programming bound [1, Section 15.6] in terms of pseudo-variables.

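Definition 14 (LP bound): Given a network $Q$ and a connection requirement $M$, the LP bound is the set of rate-capacity tuples $(\lambda, \omega)$ such that there exists a set of pseudo-variables $\{U_s : s \in \mathcal{S}, U_e : e \in \mathcal{E}\}$ satisfying the following "connection constraint":

$$
\begin{aligned}
H(U_e \mid U_f : f \to e) &= 0, && e \in \mathcal{E} \\
H(U_s \mid U_f : f \to u) &= 0, && u \in \mathcal{D}(s) \\
\perp_{s \in \mathcal{S}} U_s & \\
H(U_s) &\geq \lambda_s, && s \in \mathcal{S} \\
H(U_e) &\leq \omega_e, && e \in \mathcal{E}.
\end{aligned}
\tag{5}
$$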

Denote the set of rate-capacity tuples that satisfy the LP bound by $T_{LP}$. From [1] it is known that $T_{LP} \supseteq T^{\epsilon}$. It is interesting to notice that the use of pseudo-variables gives a notational unification of an inner bound and an outer bound given in [1] as follows:

Proposition 3 (Inner and Outer bounds): Given a network $Q$ and a connection requirement $M$, let $T_{in}$ resp. $T_{out}$ be the set of rate-capacity tuples $(\lambda, \omega)$ such that there exists a set of entropic resp. almost entropic pseudo-variables $\{U_s : s \in \mathcal{S}, U_e : e \in \mathcal{E}\}$ satisfying (5). Then $T_{in} \subseteq T^{\epsilon} \subseteq T_{out} \subseteq T_{LP}$.

Proof: The proof is straightforward by rewriting the bounds obtained in [1]. ■ 

Similar to the LP bound, we define the following bound for abelian group network codes 
(including linear network codes) as follows. 

Definition 15 (LP-Ingleton bound): Given a network $Q$ and a connection requirement $M$, the LP-Ingleton bound is the set of rate-capacity tuples $(\lambda, \omega)$ such that there exists a set of pseudo-variables $\{U_s : s \in \mathcal{S}, U_e : e \in \mathcal{E}\}$ satisfying the Ingleton inequalities (1) and the connection constraint (5).
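
The Ingleton inequalities are linear in the pseudo-entropy function, so they can be tested directly on a candidate function. The sketch below assumes the standard four-element form of the Ingleton inequality, $h(ab) + h(ac) + h(ad) + h(bc) + h(bd) \geq h(a) + h(b) + h(abc) + h(abd) + h(cd)$, applied to singletons only; the general form (1) of the paper, stated for arbitrary subsets, is not reproduced here. The function names and both example functions are hypothetical.

```python
from itertools import combinations, permutations

def ingleton_holds(h, a, b, c, d, tol=1e-9):
    """Standard Ingleton inequality for the ordered quadruple (a, b, c, d)."""
    f = lambda *xs: h[frozenset(xs)]
    lhs = f(a, b) + f(a, c) + f(a, d) + f(b, c) + f(b, d)
    rhs = f(a) + f(b) + f(a, b, c) + f(a, b, d) + f(c, d)
    return lhs >= rhs - tol

def satisfies_ingleton(h, ground):
    """Check the inequality for every ordered choice of four distinct elements."""
    return all(ingleton_holds(h, *q) for q in permutations(ground, 4))

ground = (1, 2, 3, 4)
all_sets = [frozenset(s) for r in range(5) for s in combinations(ground, r)]

# Four independent fair bits: h(A) = |A|.
h_iid = {s: float(len(s)) for s in all_sets}

# A polymatroid on four elements with h = 2, 3, 4, 4 on sets of size 1, 2, 3, 4,
# except that the pair {3, 4} has value 4 (the axioms can be checked by hand).
by_size = {0: 0.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 4.0}
h_bad = {s: by_size[len(s)] for s in all_sets}
h_bad[frozenset({3, 4})] = 4.0

print(satisfies_ingleton(h_iid, ground))   # True
print(satisfies_ingleton(h_bad, ground))   # False: fails for (a, b, c, d) = (1, 2, 3, 4)
```

The second example shows why the LP-Ingleton bound can be strictly tighter than the LP bound: a pseudo-entropy function may satisfy every polymatroid axiom and still be excluded by an Ingleton inequality.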

Proposition 4: Denote the set of rate-capacity tuples that satisfy the LP-Ingleton bound by $T_{LP,I}$. Then $T_{LP,I}$ contains every rate-capacity tuple achievable by abelian group network codes.

Proof: First notice that all source and link random variables of an abelian group network code must satisfy the Ingleton inequalities. The proposition then follows by using a similar argument as in [1] that proves $T_{LP} \supseteq T^{\epsilon}$. ■

Since the LP and LP-Ingleton bounds are defined by intersections of several linear half-spaces and hyperplanes, these bounds are polyhedral. Together with the following duality results, this implies that LP bounds are not generally tight (this is proved in Section V).

IV. Entropy functions, network codes and duality

Given a network, a connection requirement and a rate-capacity tuple, the multicast problem 
is to determine whether or not the rate-capacity tuple is admissible or achievable (perhaps 
even restricted to codes in a particular class). In this section, we construct multicast problems 
from non-negative functions. This construction yields several dualities between properties of the 
generating function and the solubility of the multicast problem. We establish three main dualities. 
The first duality relates entropy functions and network codes. It can be paraphrased as follows. 

A function is quasi-uniform if and only if its induced rate-capacity tuple is admissible.
This is shown in Theorem 1. Theorem 2 provides an extension which implies

A function is almost entropic if and only if its induced rate-capacity tuple is achievable.
The second duality proves similar results for linear network codes.

An entropy function is linear group characterizable if and only if its induced rate-capacity tuple is admissible by linear network codes.
This is Theorem 3. Again, Theorem 4 extends the result, relating almost linear group characterizable functions and achievable rate-capacity tuples with linear network codes.

The third duality, Theorem 5, relates polymatroids and the linear programming bound.
A function is a polymatroid if and only if its induced rate-capacity tuple satisfies the LP bound.

We also give a partial result for an extension to polymatroids that also satisfy the Ingleton 
inequality. 

Despite their apparent simplicity, these results lead to many interesting corollaries: linear network codes (or more generally, abelian group network codes) are suboptimal, the LP bound is not tight, and in general the network coding capacity region is not a polytope. These consequences will be described in more detail in Section V.

A. Constructing multicast problems 

Let $h \in \mathcal{H}[\mathcal{N}]$ be a given non-negative function over the ground set $\mathcal{N} = \{1, 2, \ldots, N\}$. The proof of the main result relies on the construction of a special network $Q^{\dagger}$, a connection requirement $M^{\dagger}$ and a rate-capacity tuple $\mathsf{T}(h) = (\lambda(h), \omega(h))$.

Figure 2 defines the network topology, connection requirement and edge capacities. For convenience, the network is divided into several subnetworks. To differentiate the roles of network nodes, source nodes are indicated by open circles, destination nodes are double circles, and intermediate nodes are solid circles. By construction, each node takes only one role. The label beside a source node is the input message available to that source node (this defines the source location mapping $\mathcal{O}$). The label beside a receiver node indicates the desired source message to be reconstructed at that destination node (this defines the destination location mapping $\mathcal{D}$).
To simplify notation, each capacitated edge is labeled with a pair of symbols denoting the 
edge message (and corresponding random variable), and the edge capacity. Unlabelled edges are 
assumed to be uncapacitated, or to have a finite but sufficiently large capacity (such as h(a)) 
to losslessly forward all received messages. 



The first part of the network, shown in Figure 2(a), contains the sources. There are $2^N - 1$ independent sessions, $\mathcal{S} = \{S_{[\alpha]} : \emptyset \neq \alpha \in 2^{\mathcal{N}}\}$.^ The desired source rate associated with session $\alpha$ is $h(\alpha)$. Singletons $\{i\} \in 2^{\mathcal{N}}$ will be denoted without brackets, e.g. $h(i)$ and $S_{[i]}$. There are specific edge messages that are of particular interest. Rather than naming all edge variables $U_e, e \in \mathcal{E}$, we label these particular edge variables $V_j$, $j = 1, \ldots, N$. Remaining edge variables will be labelled with generic symbols $W, W', W'', W^*$ and $W^{**}$. Source $S_{[\mathcal{N}]}$ generates the network coded messages $V_1, V_2, \ldots, V_N$, which are duplicated as required and forwarded to the rest of the network. The remaining part of the network is divided into subnetworks of three types, shown in Figures 2(b), 2(c) and 2(d).



With reference to Figure 2(b), type 0 subnetworks connect a single source to one receiver. There are $2^N - 1$ type 0 subnetworks, indexed by the choice of $\emptyset \neq \alpha \in 2^{\mathcal{N}}$.

Referring to Figure 2(c), there are $2^N - 1$ type 1 subnetworks, one for each nonempty $\alpha \in 2^{\mathcal{N}}$. These subnetworks introduce an edge of capacity $h(\mathcal{N}) - h(\alpha)$ between source $S_{[\mathcal{N}]}$ and a sink requiring $S_{[\mathcal{N}]}$. There is an intermediate node which has another $|\alpha|$ incident edges (from Figure 2(a)), carrying the messages $V_\alpha = \{V_j, j \in \alpha\}$. The intermediate node then has an edge of capacity $h(\alpha)$ to the sink.

Finally, Figure 2(d) shows the structure of the type 2 subnetworks. Type 2 subnetworks are indexed by a set $\alpha$, where $\emptyset \neq \alpha \subset \mathcal{N}$, and an element $i \in \mathcal{N}$, $i \notin \alpha$. Each type 2 subnetwork connects two sources $S_{[\alpha]}$ and $S_{[\mathcal{N}]}$ and two receivers respectively requiring $S_{[\alpha]}$ and $S_{[\mathcal{N}]}$. In addition, there are $|\alpha| + 2$ other incident edges from Part 1 of the network, carrying $V_\alpha$ and two copies of $V_i$. For notational simplicity, we have written $h(\alpha \cup \{i\}) = h(\alpha, i)$.

^For simplicity, we use the same symbol to denote the index of a multicast session and the associated source random variable.

So far, we have described a network $Q^{\dagger}$ and a connection requirement $M^{\dagger}$, and have assigned rates to sources and capacities to links. Clearly $M^{\dagger}$ depends only on $N$, and not in any other way on $h$. Similarly, the topology of the network $Q^{\dagger}$ depends only on $N$. The choice of $h$ affects only the source rates and edge capacities, which are collected into the rate-capacity tuple $\mathsf{T}(h)$. Also, we can assume without loss of generality that $\mathsf{T}(h)$ is a linear function of $h$.
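
To make the dependence of the rate-capacity tuple on $h$ concrete, the following sketch tabulates the source rates and the edge capacities that are stated explicitly in this section (the type 0 edge, the two type 1 edges, the edges carrying $V_i$, and the type 2 edges $W$, $W'$, $W^*$ and $W''$). The dictionary keys are hypothetical labels invented for the sketch, and Figure 2 remains the authoritative definition; edges whose capacities are not stated in the text, such as $W^{**}$, are omitted.

```python
from itertools import combinations

def nonempty_subsets(N):
    """All nonempty subsets of {1,...,N} as frozensets."""
    ground = range(1, N + 1)
    return [frozenset(c) for r in range(1, N + 1) for c in combinations(ground, r)]

def induced_tuple(h, N):
    """Return (rates, capacities) induced by h on the fixed topology."""
    full = frozenset(range(1, N + 1))
    rates, caps = {}, {}
    for a in nonempty_subsets(N):
        rates[("S", a)] = h[a]                      # source rate of session a
        caps[("type0", a, "W")] = h[a]              # single edge to the receiver
        caps[("type1", a, "W")] = h[full] - h[a]    # edge from source S[N]
        caps[("type1", a, "W'")] = h[a]             # intermediate node to sink
    for i in full:
        caps[("V", i)] = h[frozenset({i})]          # bound on the edge carrying V_i
    for a in nonempty_subsets(N):
        for i in full - a:
            caps[("type2", a, i, "W")] = h[a]
            caps[("type2", a, i, "W'")] = h[full] - h[a]
            caps[("type2", a, i, "W*")] = h[a]
            caps[("type2", a, i, "W''")] = h[a | {i}] - h[frozenset({i})]
    return rates, caps

# Hypothetical example for N = 2: h is the entropy function of two independent fair bits.
h = {frozenset({1}): 1.0, frozenset({2}): 1.0, frozenset({1, 2}): 2.0}
rates, caps = induced_tuple(h, 2)
print(len(rates), len(caps))
```

Every entry is a fixed linear combination of values of $h$, so scaling $h$ scales the whole tuple; this is the sense in which only the rates and capacities, and not the topology, depend on $h$.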

Example 2: Figure 3 shows the topology of the network $Q^{\dagger}$ when $N = 2$. Edge labels are omitted for clarity.

Fig. 3. The network $Q^{\dagger}$ when $N = 2$.

B. First Duality: Entropy functions and network codes 

Theorem 1: Let $h$ be in $\mathcal{H}[\mathcal{N}]$ for $\mathcal{N} = \{1, 2, \ldots, N\}$. The induced rate-capacity tuple $\mathsf{T}(h)$ is admissible on the network $Q^{\dagger}$ and connection requirement $M^{\dagger}$ if and only if $h$ is quasi-uniform, i.e.,

$$h \in \Gamma^{*}_{Q} \iff \mathsf{T}(h) \in T^{0}.$$

We begin with a proof of the only-if statement, i.e. starting with the assumption of admissibility, we must demonstrate that the function is quasi-uniform. By Definition 7, admissibility of $\mathsf{T}(h)$ on $Q^{\dagger}, M^{\dagger}$ requires existence of a zero-error network code $\Phi$ with source messages $S_{[\alpha]}$, $\emptyset \neq \alpha \subseteq \mathcal{N}$, and a subset of its coded messages $V_{\mathcal{N}}$ satisfying

$$H(S_{[\alpha]}) \geq h(\alpha), \quad \alpha \subseteq \mathcal{N} \tag{6}$$

$$H(S_{[\alpha]} : \alpha \subseteq \mathcal{N}) = \sum_{\alpha \subseteq \mathcal{N}} H(S_{[\alpha]}) \tag{7}$$

$$H(V_i) \leq h(i), \quad i \in \mathcal{N}. \tag{8}$$

The remaining goal is to prove $H(V_\alpha) = h(\alpha)$ for every $\alpha \subseteq \mathcal{N}$. To this end, we prove the following series of Lemmas 3-8, each predicated on admissibility of $\mathsf{T}(h)$ on $Q^{\dagger}, M^{\dagger}$.

Lemma 3: $H(S_{[\alpha]}) = h(\alpha)$ for all $\emptyset \neq \alpha \subseteq \mathcal{N}$.

Proof: Consider the type 0 subnetworks of Figure 2(b). Admissibility implies that each receiver can correctly reconstruct its required source message. This is not possible unless $H(S_{[\alpha]}) \leq H(W) \leq h(\alpha)$, which together with (6) proves the lemma. ■

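Lemma 4: $h(\alpha) \leq H(V_\alpha)$ for all $\emptyset \neq \alpha \subseteq \mathcal{N}$.

Proof: Consider type 1 subnetworks in Figure 2(c). In order for the receiver to correctly determine the requested source message $S_{[\mathcal{N}]}$, it must be true that $H(V_\alpha) + H(W) \geq H(S_{[\mathcal{N}]})$. Furthermore, $H(W) \leq h(\mathcal{N}) - h(\alpha)$. Hence,

$$
\begin{aligned}
H(V_\alpha) + h(\mathcal{N}) - h(\alpha) &\geq H(V_\alpha) + H(W) \\
&\geq H(S_{[\mathcal{N}]}) \\
&= h(\mathcal{N})
\end{aligned}
$$

where the last line follows from Lemma 3. As a result, $H(V_\alpha) \geq h(\alpha)$. ■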

Lemma 5: $H(V_j) = h(j)$ for all $j \in \mathcal{N}$.

Proof: A direct consequence of Lemma 4 and (8). ■

By Lemma 5 we have taken a small step towards our goal, establishing $H(V_\alpha) = h(\alpha)$ for $|\alpha| = 1$. Extension to all $\alpha$ will be achieved by induction on $|\alpha|$. To this end, the remaining lemmas take the hypothesis $H(V_\alpha) = h(\alpha)$ for $|\alpha| = k < N$, and are proved in the context of type 2 subnetworks indexed by $\alpha$ and an element $i \in \mathcal{N}$, $i \notin \alpha$, as shown in Figure 2(d).

Lemma 6: In type 2 subnetworks, $W \perp S_{[\alpha]}$. Furthermore, if $H(V_\alpha) = h(\alpha)$, then $H(V_\alpha \mid W, S_{[\alpha]}) = 0$.

Proof: By (7), $S_{[\alpha]} \perp S_{[\mathcal{N}]}$ and hence

$$
\begin{aligned}
H(S_{[\alpha]}) + H(S_{[\mathcal{N}]}) &= H(S_{[\alpha]}, S_{[\mathcal{N}]}) \\
&\leq H(S_{[\alpha]}, S_{[\mathcal{N}]}, W, W') \\
&\overset{(i)}{=} H(W, S_{[\alpha]}, W') \\
&= H(W, S_{[\alpha]}) + H(W' \mid W, S_{[\alpha]}) \\
&\overset{(ii)}{\leq} H(W, S_{[\alpha]}) + H(W') \\
&\leq H(W) + H(S_{[\alpha]}) + H(W') \\
&\overset{(iii)}{\leq} h(\alpha) + H(S_{[\alpha]}) + H(W') \\
&\overset{(iv)}{\leq} h(\alpha) + H(S_{[\alpha]}) + h(\mathcal{N}) - h(\alpha) \\
&\overset{(v)}{=} H(S_{[\alpha]}) + H(S_{[\mathcal{N}]}).
\end{aligned}
$$

Step (i) follows from the fact that $S_{[\mathcal{N}]}$ is determined from $W, S_{[\alpha]}, W'$ at the upper receiver in Figure 2(d). Inequality (ii) is by discarding conditioning (note that both $W$ and $W'$ depend on $S_{[\mathcal{N}]}$, so this is indeed only an inequality). Inequalities (iii) and (iv) follow from the type 2 subnetwork capacity constraints,

$$H(W) \leq h(\alpha) \tag{9}$$

$$H(W') \leq h(\mathcal{N}) - h(\alpha) \tag{10}$$

and from Lemma 3. Finally, (v) is by Lemma 3. Thus the series of inequalities is actually a series of identities, and as a result,

$$H(W) = h(\alpha) \tag{11}$$

$$H(W, S_{[\alpha]}) = H(W) + H(S_{[\alpha]}) = 2h(\alpha) \tag{12}$$

which proves $W \perp S_{[\alpha]}$. Now consider

$$
\begin{aligned}
H(V_\alpha \mid W, S_{[\alpha]}) &= H(V_\alpha, W, S_{[\alpha]}) - H(W, S_{[\alpha]}) \\
&\overset{(i)}{\leq} H(V_\alpha) + H(S_{[\alpha]}) - H(W, S_{[\alpha]}) \\
&\overset{(ii)}{=} H(V_\alpha) - h(\alpha) \\
&= 0 \quad \text{if } H(V_\alpha) = h(\alpha)
\end{aligned}
$$

where (i) holds since $W$ is a function of $V_\alpha, S_{[\alpha]}$ and (ii) is by (11) and (12). ■
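Lemma 7: In type 2 subnetworks, $H(W \mid V_\alpha, W^*) = H(W \mid W^*) = H(W)$, or equivalently, $I(W; V_\alpha, W^*) = 0$.

Proof: Recalling that $i \notin \alpha \subset \mathcal{N}$,

$$
\begin{aligned}
H(W \mid V_\alpha, W^*) &\geq H(W \mid V_\alpha, W^*, V_i) \\
&\overset{(i)}{=} H(W \mid V_\alpha, V_i) \\
&\overset{(ii)}{=} H(W \mid V_\alpha, V_i) + H(S_{[\alpha]} \mid V_\alpha, V_i, W) \\
&\overset{(iii)}{\geq} H(S_{[\alpha]}) \\
&\overset{(iv)}{=} h(\alpha) \\
&\overset{(v)}{\geq} H(W) \\
&\geq H(W \mid W^*) \\
&\geq H(W \mid V_\alpha, W^*)
\end{aligned}
$$

where (i) follows from the fact that $W^*$ is a function of $V_\alpha, V_i$, (ii) follows from the fact that $S_{[\alpha]}$ can be reconstructed at the lower receiver, and (iii) follows from independence of $S_{[\alpha]}$ and $(V_\alpha, V_i)$ (the sum in the preceding line equals $H(W, S_{[\alpha]} \mid V_\alpha, V_i) \geq H(S_{[\alpha]} \mid V_\alpha, V_i)$), since by (7) $S_{[\alpha]} \perp S_{[\mathcal{N}]}$ and all the $V_j, j \in \mathcal{N}$ depend only on $S_{[\mathcal{N}]}$. Finally, (iv) is by Lemma 3, (v) is by the capacity constraint (9), and the remaining inequalities simply add extra conditioning. Thus the chain of inequalities is actually a chain of identities, the last three proving the lemma. ■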






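Lemma 8: In type 2 subnetworks, assuming $H(V_\alpha) = h(\alpha)$, $H(W^* \mid V_\alpha) = H(V_\alpha \mid W^*) = 0$.

Proof:

$$
\begin{aligned}
H(V_\alpha \mid W^*) &= H(V_\alpha \mid W^*, W) + I(V_\alpha; W \mid W^*) \\
&\overset{(i)}{=} H(V_\alpha \mid W^*, W) \\
&\leq H(V_\alpha, S_{[\alpha]} \mid W^*, W) \\
&= H(V_\alpha \mid W^*, W, S_{[\alpha]}) + H(S_{[\alpha]} \mid W^*, W) \\
&\overset{(ii)}{=} H(V_\alpha \mid W^*, W, S_{[\alpha]}) \\
&\leq H(V_\alpha \mid W, S_{[\alpha]}) \\
&\overset{(iii)}{=} 0
\end{aligned}
$$

where (i) follows from Lemma 7, (ii) is because $S_{[\alpha]}$ can be reconstructed at the lower receiver, and (iii) is by Lemma 6 assuming $H(V_\alpha) = h(\alpha)$. Since conditional entropies are non-negative,

$$H(V_\alpha \mid W^*) = 0. \tag{13}$$

On the other hand,

$$
\begin{aligned}
H(W^* \mid V_\alpha) &= H(W^*, V_\alpha) - H(V_\alpha) \\
&= H(W^*) + H(V_\alpha \mid W^*) - H(V_\alpha) \\
&\leq h(\alpha) - h(\alpha) = 0
\end{aligned}
$$

where the last inequality uses (13), the type 2 subnetwork capacity bound $H(W^*) \leq h(\alpha)$ and the assumption $H(V_\alpha) = h(\alpha)$. Non-negativity of conditional entropy yields $H(W^* \mid V_\alpha) = 0$. ■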
We are now ready to assemble the preceding lemmas into a proof for the only-if part of Theorem 1.

Proof (only-if part of Theorem 1): The goal is to prove $H(V_\alpha) = h(\alpha)$ for all non-empty subsets $\alpha \subseteq \mathcal{N}$. This was already shown for $|\alpha| = 1$ in Lemma 5. Extension to all $\alpha$ will be achieved using induction. First, assume the hypothesis is true for all $\alpha \subseteq \mathcal{N}$ with $1 \leq |\alpha| \leq k < N$. For any $i \in \mathcal{N}$ and $\alpha \subset \mathcal{N}$ such that $i \notin \alpha$ and $|\alpha| = k$, consider the type 2 subnetwork of Figure 2(d). We must show that $H(V_\alpha, V_i) = h(\alpha \cup \{i\}) = h(\alpha, i)$.

By Lemma 4 we already know that $H(V_\alpha, V_i) \geq h(\alpha, i)$. Therefore it remains only to prove $H(V_\alpha, V_i) \leq h(\alpha, i)$. Now

$$
\begin{aligned}
H(V_\alpha, V_i) &\leq H(V_\alpha, V_i, W^*) \\
&\overset{(i)}{=} H(V_i, W^*) \\
&\leq H(V_i, W^*, W'') \\
&\overset{(ii)}{=} H(V_i, W'') \\
&\leq H(V_i) + H(W'') \\
&\overset{(iii)}{\leq} H(V_i) + h(\alpha, i) - h(i) \\
&\overset{(iv)}{=} h(i) + h(\alpha, i) - h(i) \\
&= h(\alpha, i)
\end{aligned}
$$

where (i) follows from Lemma 8 (which holds under the induction hypothesis), (ii) is due to the fact that $W^*$ is a function of $W'', V_i$ and (iii) is from the type 2 subnetwork capacity bound $H(W'') \leq h(\alpha, i) - h(i)$. Finally, (iv) is by Lemma 5.

Up to this point, we have proved that $h$ is the entropy function of a set of random variables $\{V_1, \ldots, V_N\}$. To show that $h$ is indeed quasi-uniform, it suffices to prove that for any subset $\alpha$ of $\mathcal{N}$, the set of random variables $V_\alpha$ is quasi-uniform. Since we have just shown that $H(V_\alpha) = h(\alpha)$, if the receiver in the type 1 subnetwork can decode $S_{[\mathcal{N}]}$, then $H(V_\alpha \mid W') = H(W' \mid V_\alpha) = 0$. Hence, $H(W') = h(\alpha)$. Now according to the link capacity constraint, $W'$ is defined on an alphabet set of size $2^{h(\alpha)}$, and $W'$ (and hence $V_\alpha$) must be quasi-uniform. ■

It remains to prove the "if" statement in the theorem, i.e. to show that quasi-uniform random variables imply admissibility.

Proof (if part of Theorem 1): It suffices to show that one can construct a network code (defined by input variables, and message variables) meeting the connection requirement subject to the individual capacity constraint on each link.

The construction for the input variables is simple. For any $\emptyset \neq \alpha \subseteq \mathcal{N}$, define $S_{[\alpha]}$ to be a quasi-uniform random variable with entropy $h(\alpha)$. These input variables are also assumed to be independent. It remains to show that we can construct edge variables satisfying the capacity constraints, and which allow each receiver to reconstruct the requested messages perfectly.

By the quasi-uniformity of $S_{[\alpha]}$, it is clear that all receivers in type 0 subnetworks can reconstruct their requested message simply by having the source transmit the uncoded message.

Let $\{V_j : j \in \mathcal{N}\}$ be a set of quasi-uniform random variables whose entropy function is $h$. Since $H(V_{\mathcal{N}}) = H(S_{[\mathcal{N}]})$, there is a one-to-one mapping between $\Omega(V_{\mathcal{N}})$ and $\Omega(S_{[\mathcal{N}]})$. As they are both quasi-uniform, $S_{[\mathcal{N}]}$ and $(V_j : j \in \mathcal{N})$ can be regarded as the same.

For type 1 subnetworks, by quasi-uniformity of $V_\alpha$, one can send $V_\alpha$ unencoded as $W'$. Then the receivers see $V_\alpha$ and an auxiliary message $W$ defined on a sample space of size at most $2^{h(\mathcal{N}) - h(\alpha)}$. Reconstructing $S_{[\mathcal{N}]}$ at the receiver is equivalent to reconstructing $V_{\mathcal{N}}$ at the receiver.

By the quasi-uniformity of $S_{[\alpha]}$ and Lemma 2, $V_{\mathcal{N} \setminus \alpha}$ can be compressed to a symbol $W$ of size $2^{h(\mathcal{N}) - h(\alpha)}$ such that $V_{\mathcal{N} \setminus \alpha}$ can be losslessly reconstructed from $W$ and $V_\alpha$.

It remains to verify that receivers in type 2 subnetworks can reconstruct all requested messages. Recall that both $S_{[\alpha]}$ and $V_\alpha$ are quasi-uniform. Assume without loss of generality that their supports are $\{0, 1, 2, \ldots, 2^{h(\alpha)} - 1\}$. Then we can define $W = V_\alpha + S_{[\alpha]} \bmod 2^{h(\alpha)}$. It is easy to verify the following properties:

$$H(W \mid V_\alpha, S_{[\alpha]}) = H(S_{[\alpha]} \mid W, V_\alpha) = H(V_\alpha \mid W, S_{[\alpha]}) = 0, \tag{14}$$

$$\log |\Omega(W)| = h(\alpha). \tag{15}$$



By (14), the upper receiver can correctly reconstruct $V_\alpha$ from $S_{[\alpha]}$ and $W$. Using a similar compression scheme as used in type 1 subnetworks, source $S_{[\mathcal{N}]}$ is compressed to $h(\mathcal{N}) - h(\alpha)$ bits, allowing lossless reconstruction of $S_{[\mathcal{N}]}$ at the upper receiver.

On the other hand, it is easy to see that $\{V_\alpha, V_i\}$ is quasi-uniform. Hence $V_\alpha$ can be compressed into $W''$ with a support of size $|\Omega(W'')| = 2^{h(\alpha, i) - h(i)}$ such that $V_\alpha$ can be reconstructed by using $W''$ and $V_i$. As a result, $W^*$ may be transmitted as $V_\alpha$ without any encoding. The lower receiver can then recover $S_{[\alpha]}$ from $V_\alpha$ and $W$.
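
The modular-addition message $W = V_\alpha + S_{[\alpha]}$ used in the type 2 subnetworks can be checked numerically on a toy alphabet. The sketch below assumes both variables are independent and uniform on $\{0, \ldots, M-1\}$ with $M = 4$ (so $h(\alpha) = 2$ bits); this choice, and the helper `entropy`, are illustrative assumptions. It verifies that each of $W$, $V_\alpha$, $S_{[\alpha]}$ is determined by the other two and that $W$ fits in an alphabet of size $2^{h(\alpha)}$.

```python
from itertools import product
from math import log2

def entropy(samples):
    """Entropy in bits of the empirical distribution of equally likely samples."""
    counts = {}
    for s in samples:
        counts[s] = counts.get(s, 0) + 1
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())

M = 4  # alphabet size 2**h(alpha); h(alpha) = 2 bits is an illustrative choice

# V and S independent and uniform on {0,...,M-1}; W = V + S mod M.
triples = [(v, s, (v + s) % M) for v, s in product(range(M), repeat=2)]

H_VSW = entropy(triples)
H_VS = entropy([(v, s) for v, s, w in triples])
H_WV = entropy([(w, v) for v, s, w in triples])
H_WS = entropy([(w, s) for v, s, w in triples])
H_W = entropy([w for v, s, w in triples])

# Each conditional entropy is a difference of joint entropies.
print(H_VSW - H_VS, H_VSW - H_WV, H_VSW - H_WS)  # H(W|V,S), H(S|W,V), H(V|W,S)
print(H_W)                                       # entropy of W in bits
```

In this toy case $W$ is also independent of $S_{[\alpha]}$, since `H_WS` equals `H_W` plus the entropy of $S_{[\alpha]}$, which matches the conclusion $W \perp S_{[\alpha]}$ of Lemma 6.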

Since all receivers can reconstruct their requested source messages with properly constructed message random variables satisfying the capacity constraints, the rate-capacity tuple $\mathsf{T}(h)$ is admissible. ■

Definition 16: A polymatroid $h$ is called almost entropic if there exists a sequence of entropic pseudo-entropy functions $h^{(k)}$ and positive constants $r(k)$ such that $\lim_{k \to \infty} h^{(k)}/r(k) = h$.

As $\bar{\Gamma}^*$ is a closed and convex cone [30], the set of all almost entropic functions is $\bar{\Gamma}^*$. Theorem 1 establishes a duality, or equivalence, between the quasi-uniformity of $h$ and admissibility of $\mathsf{T}(h)$. The following theorem extends this result to a duality between almost entropic $h$ and asymptotically admissible (and achievable) $\mathsf{T}(h)$.

Theorem 2: Let $h \in \mathcal{H}[\mathcal{N}]$ for $\mathcal{N} = \{1, 2, \ldots, N\}$ and let $\mathsf{T}(h)$ be an induced rate-capacity tuple. Then we have

$$h \in \bar{\Gamma}^* \iff \mathsf{T}(h) \in T^{\infty} \iff \mathsf{T}(h) \in T^{\epsilon}.$$

In other words, the rate-capacity tuple $\mathsf{T}(h)$ is asymptotically admissible (or achievable) on the network $Q^{\dagger}$ and connection requirement $M^{\dagger}$ if and only if $h$ is almost entropic.

Proof: Suppose that $h$ is almost entropic. We will first show that $\mathsf{T}(h) \in T^{\infty}$. By [12], [26], one can construct a sequence of quasi-uniform entropic functions $h^{(n)}$ and normalizing constants $r(n)$ such that $\lim_{n \to \infty} h^{(n)}(\alpha)/r(n) = h(\alpha)$. By Theorem 1, each $\mathsf{T}(h^{(n)})$ is admissible. By property P2, the set of asymptotically admissible rate-capacity tuples is a closed and convex cone and hence $\mathsf{T}(h) \in T^{\infty}$.

Clearly, $\mathsf{T}(h) \in T^{\infty}$ implies that $\mathsf{T}(h) \in T^{\epsilon}$. It remains to show that achievability of $\mathsf{T}(h)$ implies that $h$ is almost entropic. Suppose that $\mathsf{T}(h) \in T^{\epsilon}$. According to Definition 9, one can construct a sequence of normalizing constants $r(n)$ and network codes $\Phi^{(n)}$ with source messages $\{S^{(n)}_{[\alpha]}, \alpha \subseteq \mathcal{N}\}$ and edge messages $V^{(n)}_j$ such that^

$$\lim_{n \to \infty} \frac{1}{r(n)} H\left(V^{(n)}_i\right) \leq h(i) \tag{17}$$

$$\lim_{n \to \infty} P_e\left(\Phi^{(n)}\right) = 0. \tag{18}$$

For each value of the sequence index $n$, consider the network and connection requirement of Figure 2, with sources $\mathcal{S} = \{S^{(n)}_{[\alpha]} : \emptyset \neq \alpha \in 2^{\mathcal{N}}\}$ and edge messages $V^{(n)}_j$. By the Fano inequality, the entropy of any source $s \in \mathcal{S}$ conditioned on the edge variables incident to any node in $\mathcal{D}(s)$ can be made as small as desired by increasing $n$. Following a similar procedure as in the proof for Theorem 1, it can be proved that for any non-empty subset $\emptyset \neq \alpha \subseteq \mathcal{N}$,

$$\lim_{n \to \infty} \frac{1}{r(n)} H\left(V^{(n)}_\alpha\right) = h(\alpha).$$

In other words, $h$ is almost entropic. ■

^By the Bolzano-Weierstrass Theorem, which says that any sequence in a closed and bounded interval has a convergent subsequence, we can safely assume that $\lim_{k \to \infty} \frac{1}{r(k)} H(S^{(k)}_{[\alpha]}, V^{(k)}_{\beta})$ exists for any nonempty subsets $\alpha, \beta$ of $\mathcal{N}$.

C. Second Duality: Linear group characterizable functions and linear network codes

The first duality shows that $h$ is quasi-uniform (almost entropic) if and only if $\mathsf{T}(h)$ is admissible (achievable). We will now prove a similar result, restricting the network codes to be linear.

Theorem 3: Let $h \in \mathcal{H}[\mathcal{N}]$ for $\mathcal{N} = \{1, 2, \ldots, N\}$. The induced rate-capacity tuple $\mathsf{T}(h)$ is admissible using linear network codes on the network $Q^{\dagger}$ and connection requirement $M^{\dagger}$ if and only if $h$ is linear group characterizable, i.e.,

$$h \in \Gamma^{*}_{L(q)} \iff \mathsf{T}(h) \in T^{0}_{L(q)}.$$

Proof (only-if part of Theorem 3): The proof of the only-if part is very similar to the one given in Theorem 1. Suppose that $\mathsf{T}(h) \in T^{0}_{L(q)}$, i.e., $\mathsf{T}(h)$ is admissible using a linear network code $\Phi$ on the network $Q^{\dagger}$ and connection requirement $M^{\dagger}$. By Proposition 2, the set of source and link random variables induced by $\Phi$ is linear group characterizable. Using the same argument as in the proof for Theorem 1, $h$ is the entropy function of a subset of these linear group characterizable random variables. Hence, $h$ is linear group characterizable.

In fact, using the same argument, we can show that if the induced rate-capacity tuple $\mathsf{T}(h)$ is admissible using abelian network codes on the network $Q^{\dagger}$ and connection requirement $M^{\dagger}$, then $h$ is abelian group characterizable. ■

Before we prove the if part of Theorem 3, we need the following lemma, which serves a similar role as Lemma 2 in the proof of Theorem 1 by justifying the feasibility of a certain "compression" scheme.

Lemma 9: Consider a special case of the network depicted in Figure 1 where the left node receives $T_1(a)$ and $T_2(a)$ as inputs, where $T_1$ and $T_2$ are two linear functions defined on a vector space $A$ over $\mathbb{F}_q$. Let the kernels of $T_1$ and $T_2$ be respectively $B_1$ and $B_2$. Then, there exists a linear function $W$ of $T_1(a)$ and $T_2(a)$ such that (1) $T_1(a)$ is uniquely determined from $W$ and $T_2(a)$, and (2) $W$ takes at most $q^{\dim B_2 - \dim B_1 \cap B_2}$ different values.

Proof: From $B_1$ and $B_2$, we can construct three subspaces $W_1$, $W_2$ and $W_0$ such that

$$\dim W_0 + \dim W_1 + \dim W_2 + \dim B_1 \cap B_2 = \dim A$$

and that for each $i = 1, 2$, the subspace $B_i$ is equal to the linear span of $W_i$ and $B_1 \cap B_2$. Hence any $a \in A$ can be written uniquely as $a = a_0 + a_1 + a_2 + b$ where $a_i \in W_i$ for $i = 0, 1, 2$ and $b \in B_1 \cap B_2$.

Since $\ker(T_1) = B_1$, we have $T_1(a_0 + a_1 + a_2 + b) = T_1(a_2) + T_1(b)$. Furthermore, one can easily construct a linear function $T_1^*$ such that $T_1^*(T_1(a)) = (a_2, b)$. Similarly, there exists a linear function $T_2^*$ such that $T_2^*(T_2(a)) = (a_1, b)$.

To compute $T_1(a)$ at node 2, it suffices to compute $a_2$, as $b$ can be computed directly from $T_2(a)$. A simple counting argument shows that $a_2$ lies in a vector subspace of dimension $\dim B_2 - \dim B_1 \cap B_2$. Therefore, we can set $W = a_2$ over the network and it takes at most $q^{\dim B_2 - \dim B_1 \cap B_2}$ different values. ■

Now we may continue our proof for Theorem 3.

Proof (if part of Theorem 3): To prove the direct part of Theorem 3, we need to show that if $h$ is linear group characterizable, then one can construct a linear network code (defined by the induced source and link random variables) meeting the connection requirement subject to the individual capacity constraint on each link.

Suppose that $h$ is linear group characterizable by a vector space $V$ and its subspaces $V_1, \ldots, V_N$, defined over a field $\mathbb{F}_q$. Assume without loss of generality that the subspaces intersect only at the zero vector, $\cap_{j=1}^{N} V_j = \{0\}$. As such, $h(\mathcal{N}) = \log q \cdot (\dim V)$ and for any $\alpha \subseteq \mathcal{N}$, we have $h(\alpha) = \log q \cdot (\dim V - \dim \cap_{j \in \alpha} V_j)$.
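
The formula $h(\alpha) = \log q \cdot (\dim V - \dim \cap_{j \in \alpha} V_j)$ can be evaluated mechanically once each subspace is given as the kernel of a linear map $f_j$: by the rank-nullity theorem, $\dim V - \dim \cap_{j \in \alpha} \ker f_j$ is the rank of the matrix obtained by stacking the rows of $f_j$, $j \in \alpha$. The sketch below assumes $q = 2$ (so entropies in bits equal ranks) and encodes rows as integer bitmasks; the function names and the three-functional example are hypothetical.

```python
from itertools import combinations

def rank_gf2(rows):
    """Rank over GF(2) of row vectors encoded as integer bitmasks."""
    basis = []  # reduced rows with distinct leading bits, kept in decreasing order
    for r in rows:
        for b in basis:
            r = min(r, r ^ b)   # clears the leading bit of b in r when it is set
        if r:
            basis.append(r)
            basis.sort(reverse=True)
    return len(basis)

def linear_h(functionals):
    """h(alpha) in bits for q = 2: rank of the stacked rows of f_j, j in alpha.

    `functionals[j]` is a list of GF(2) rows (bitmasks) defining f_j, whose
    kernel plays the role of the subspace V_j.
    """
    idx = sorted(functionals)
    h = {frozenset(): 0}
    for r in range(1, len(idx) + 1):
        for alpha in combinations(idx, r):
            rows = [row for j in alpha for row in functionals[j]]
            h[frozenset(alpha)] = rank_gf2(rows)
    return h

# Hypothetical example over V = GF(2)^2: f1(x) = x1, f2(x) = x2, f3(x) = x1 + x2.
functionals = {1: [0b10], 2: [0b01], 3: [0b11]}
h = linear_h(functionals)
for alpha in sorted(h, key=lambda s: (len(s), sorted(s))):
    print(sorted(alpha), h[alpha])
```

In this example the three kernels intersect only at the zero vector, each single functional has rank 1, and any two of them already have full rank 2, so $h(\mathcal{N}) = \dim V = 2$ bits.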

For $j = 1, \ldots, N$, construct linear functions $f_j$ over $V$ such that $\ker(f_j) = V_j$. The source random variable $S_{[\mathcal{N}]}$ is uniformly distributed over $V$ such that the link symbols transmitted in Figure 2(a) are $V_j = f_j(S_{[\mathcal{N}]})$. For any other $\emptyset \neq \alpha \subset \mathcal{N}$, define $S_{[\alpha]}$ to be a random variable, uniformly distributed over a vector space of dimension $\log_q 2 \cdot h(\alpha)$ (hence, $H(S_{[\alpha]}) = h(\alpha)$). All these source random variables are assumed to be independent.

Up to this point, we have described how source and link random variables are defined in Figure 2(a). It remains to show that we can construct a linear network code, consisting of a set of link random variables which are linear functions of the incident source/link random variables, satisfying the capacity constraints, and which allow each receiver to reconstruct the requested messages perfectly.

For type 0 subnetworks, all receivers can reconstruct their requested message simply by having the source transmit the uncoded message, $W = S_{[\alpha]}$. Clearly, the associated link random variables in these subnetworks are linear functions of the incident ones and meet the capacity constraint.

For type 1 subnetworks, let $W' = (V_i : i \in \alpha) = (f_i(S_{[\mathcal{N}]}) : i \in \alpha)$, which depends linearly on $S_{[\mathcal{N}]}$. Note that $(f_i(a) : i \in \alpha) = 0$ if and only if $f_i(a) = 0$ for all $i \in \alpha$, or equivalently, when $a \in \cap_{i \in \alpha} V_i$. By the rank-nullity theorem, $W'$ can take at most $|V| / |\cap_{i \in \alpha} V_i|$ different values. We can thus treat $W'$ as a vector in a space of dimension $\dim V - \dim \cap_{j \in \alpha} V_j$.

As a result, the subnetwork can now be treated as a special case of Lemma 9 such that $T_1(a) = a$ and $T_2(a) = (f_i(a) : i \in \alpha)$. The dimensions of the kernels of $T_1$ and $T_2$ are respectively $0$ and $\dim \cap_{j \in \alpha} V_j$. By Lemma 9, the required rate is thus $\log q \cdot (\dim \cap_{j \in \alpha} V_j) = h(\mathcal{N}) - h(\alpha)$.

Similarly, for type 2 subnetworks, let $W^{**} = (f_i(S_{[\mathcal{N}]}) : i \in \alpha)$. As before, we can treat $W^{**}$ as a vector of length $\dim V - \dim \cap_{j \in \alpha} V_j$. Similarly, $S_{[\alpha]}$ can also be regarded as a vector of the same length. We can therefore define $W$ by vector addition, $W = S_{[\alpha]} + W^{**}$. Consequently, the receiver in the upper branch can reconstruct $V_\alpha$ by subtracting $S_{[\alpha]}$ from $W$. As before, one can find $W'$ as a linear function of $S_{[\mathcal{N}]}$ and this function allows $S_{[\mathcal{N}]}$ to be reconstructed from $W'$ and $V_\alpha$.

For the lower branch, we can identify a special case of Figure 1 with $T_1(a) = V_\alpha$ and $T_2(a) = V_i$. One can construct $W''$ such that (1) $W''$ is a linear function of $T_1(a)$ and $T_2(a)$, (2) the kernels are $\ker(T_1) = \cap_{j \in \alpha} V_j$ and $\ker(T_2) = V_i$, and (3) the rate required is, by Lemma 9, $\dim V_i - \dim (V_i \cap \cap_{j \in \alpha} V_j)$, which multiplied by $\log q$ equals the capacity $h(\alpha, i) - h(i)$. Therefore, we can reconstruct $V_\alpha$ from $W''$ and $T_2(a)$, where $T_1(a) = V_\alpha$. Again, treating $V_\alpha$ as a vector of length $\dim V - \dim \cap_{i \in \alpha} V_i$, the receiver at the lower branch can reconstruct $S_{[\alpha]}$ by subtracting $V_\alpha$ from $W$. ■

So far, we have proved that $h$ is linear group characterizable if and only if the rate-capacity tuple $\mathsf{T}(h)$ is admissible with a linear network code. As before, we can further generalize the result to include the case when $h$ is almost linear group characterizable according to the following definition.

Definition 17: A polymatroid $h$ is called almost linear group characterizable if there exists a sequence of linear group characterizable entropy functions $h^{(k)}$ and positive constants $r(k)$ such that $\lim_{k \to \infty} h^{(k)}/r(k) = h$.

It is easy to prove that the set of all almost linear group characterizable polymatroids is $\overline{\mathrm{con}}(\Gamma^{*}_{L(q)})$, the minimal closed and convex cone containing $\Gamma^{*}_{L(q)}$.

Theorem 4: Let $h \in \mathcal{H}[\mathcal{N}]$ for $\mathcal{N} = \{1, 2, \ldots, N\}$ and let $\mathsf{T}(h)$ be an induced rate-capacity tuple. Then we have

$$h \in \overline{\mathrm{con}}(\Gamma^{*}_{L(q)}) \iff \mathsf{T}(h) \in T^{\infty}_{L(q)} \iff \mathsf{T}(h) \in T^{\epsilon}_{L(q)}.$$

In other words, the rate-capacity tuple $\mathsf{T}(h)$ is asymptotically admissible (or achievable) by linear network codes on the network $Q^{\dagger}$ and connection requirement $M^{\dagger}$ if and only if $h$ is almost linear group characterizable.

Proof: Suppose that $h \in \overline{\mathrm{con}}(\Gamma^{*}_{L(q)})$. By Definition 17, one can construct a sequence of linear group characterizable entropy functions $h^{(k)}$ and positive constants $r(k)$ such that $\lim_{k \to \infty} h^{(k)}/r(k) = h$. By Theorem 3, each $\mathsf{T}(h^{(k)})$ is admissible by linear network codes. By property P2, the set $T^{\infty}_{L(q)}$ of asymptotically admissible rate-capacity tuples is a closed and convex cone and hence $\mathsf{T}(h) \in T^{\infty}_{L(q)}$.

Clearly, $\mathsf{T}(h) \in T^{\infty}_{L(q)}$ implies that $\mathsf{T}(h) \in T^{\epsilon}_{L(q)}$. It remains to prove that $\mathsf{T}(h) \in T^{\epsilon}_{L(q)}$ implies $h \in \overline{\mathrm{con}}(\Gamma^{*}_{L(q)})$.

Suppose that $\mathsf{T}(h)$ is achievable by linear network codes. Then one can construct a sequence of normalizing constants $r(n)$ and linear network codes $\Phi^{(n)}$ with source messages $(S^{(n)}_{[\alpha]}, \alpha \subseteq \mathcal{N})$ and edge messages $(V^{(n)}_j, j \in \mathcal{N})$ such that

$$\lim_{n \to \infty} P_e\left(\Phi^{(n)}\right) = 0. \tag{21}$$

Similar to the proof given in Theorem 2, it can be proved that for any non-empty subset $\emptyset \neq \alpha \subseteq \mathcal{N}$, $\lim_{k \to \infty} \frac{1}{r(k)} H(V^{(k)}_\alpha) = h(\alpha)$. In addition, as $(V^{(k)}_j, j \in \mathcal{N})$ is linear group characterizable, $h$ is almost linear group characterizable. ■



D. Third Duality: Polymatroids and the LP bound

Theorem 2 provides a duality between entropy functions and network codes, namely that a function $h \in \mathcal{H}[\mathcal{N}]$ is almost entropic if and only if $\mathsf{T}(h)$ is achievable on $Q^{\dagger}, M^{\dagger}$. As the set of almost entropic functions $\bar{\Gamma}^*$ has no explicit characterization for four or more variables, the sets of admissible or achievable rate-capacity tuples are unknown. Therefore computable bounds such as the linear programming bound are of great interest.

Let $\Gamma$ be the set of all polymatroids. Definition 14 writes the LP bound in terms of constraints on pseudo-variables. The following theorem provides a direct generalization of the ideas of the previous sections to pseudo-variables.

Theorem 5: Suppose $h \in \mathcal{H}[\mathcal{N}]$. A rate-capacity tuple $(\lambda(h), \omega(h))$ satisfies the LP bound if and only if $h$ is a polymatroid,

$$h \in \Gamma \iff \mathsf{T}(h) \in T_{LP}.$$

Proof: The "only if" part of the proof is a direct generalization of the proof of Theorem 1.



Suppose $(\lambda(h), \omega(h))$ satisfies the LP bound. By Definition 14, there exists a set of
pseudo-variables satisfying the set of (in)equalities in (5). In particular, there are pseudo-variables
$\{S_{[\alpha]}, \emptyset \neq \alpha \subseteq \mathcal{N}\}$ and $V_{\mathcal{N}}$ such that

$$H(S_{[\alpha]}) \geq h(\alpha), \quad \alpha \subseteq \mathcal{N}, \tag{22}$$
$$H(S_{[\alpha]} : \alpha \subseteq \mathcal{N}) = \sum_{\alpha \subseteq \mathcal{N}} H(S_{[\alpha]}), \tag{23}$$
$$H(V_i) \leq h(i). \tag{24}$$

Following the same steps as in the proof of Theorem 1 (translating random variables to
pseudo-variables) shows that $h$ is the pseudo-entropy function of $V_{\mathcal{N}}$. Hence, $h$ is a
polymatroid.

To prove the direct part, suppose $h$ is a polymatroid over the ground set
$\mathcal{L} = \{V_1, V_2, \ldots, V_N\}$ (i.e. $h$ is the pseudo-entropy function of $V_{\mathcal{N}}$).
We must exhibit a set of pseudo-variables satisfying the set of (in)equalities (5). Whereas the proof
of Theorem 1 constructs auxiliary random variables via data compression, we need to show how to
analogously adhere auxiliary pseudo-variables $W$, $W'$, etc. to the set of pseudo-variables
$V_{\mathcal{N}}$. In contrast to random variables, we cannot rely on coding theorems, or other
probabilistic constructions that assume the existence of an underlying probability distribution.
Nevertheless, it is possible to adhere pseudo-variables. This is accomplished in Appendix I, where the
proof of the direct part is also completed. ■

E. Fourth Duality: Ingleton polymatroids and the LP bound for linear codes? 

Finally, we can consider rate-capacity tuples which satisfy the LP-Ingleton bound of Definition 15.
The following theorem establishes a relation to Ingleton polymatroids (i.e., polymatroids
satisfying the Ingleton inequalities). This is shown in one direction only. Let $\Gamma_{LPI}$ be the
set of all Ingleton polymatroids.

Theorem 6: Suppose $h \in \mathcal{H}[\mathcal{N}]$. If a rate-capacity tuple
$(\lambda(h), \omega(h))$ satisfies the LP bound for linear codes, then $h$ is an Ingleton polymatroid,
i.e.,

$$T(h) \in \Upsilon_{LPI} \implies h \in \Gamma_{LPI}.$$

Proof: Suppose $(\lambda(h), \omega(h))$ satisfies the LP-Ingleton bound. By Definition 15, there exists
a set of Ingleton pseudo-variables satisfying the set of (in)equalities in (5). In particular, there
are pseudo-variables $\{S_{[\alpha]}, \emptyset \neq \alpha \subseteq \mathcal{N}\}$ and $V_{\mathcal{N}}$
such that

$$H(S_{[\alpha]}) \geq h(\alpha), \quad \alpha \subseteq \mathcal{N}, \tag{25}$$
$$H(S_{[\alpha]} : \alpha \subseteq \mathcal{N}) = \sum_{\alpha \subseteq \mathcal{N}} H(S_{[\alpha]}), \tag{26}$$
$$H(V_i) \leq h(i). \tag{27}$$

Following the same steps as in the proof of Theorem 1 (translating random variables to
pseudo-variables) shows that $h$ is the pseudo-entropy function of $V_{\mathcal{N}}$. Hence, $h$ is an
Ingleton polymatroid. ■
We conjecture that the converse of the fourth duality should also hold. In fact, it can be proved 
that if the converse fails to hold, then there exists a polymatroid satisfying Ingleton inequalities 
but which is not almost linear group characterizable. Therefore determination of whether the 
converse of the fourth duality holds is a very interesting open question. 



V. Implications 



The results of Section IV, while interesting in their own right, have several consequential
applications. First, in Section V-A, we consider implications for the determination of the network
coding capacity region (in the absence of any restriction on the class of network codes). Second,
we discuss the suboptimality of linear network codes in Section V-B.



A. The capacity region 

Implication 1 (Hardness of a multicast problem): Determination of the set of achievable source
rate-link capacity tuples $\Upsilon^{\epsilon}$ is at least as hard as the problem of determining the set
of all almost entropic functions.

Similarly, determination of the set of source rate-link capacity tuples achieved by linear
network codes, $\Upsilon^{\epsilon}_{L(q)}$, is at least as hard as the problem of determining the set of
all almost linear group characterizable entropy functions.

Proof: By Theorem 2, a polymatroid $h$ is almost entropic (respectively, almost linear group
characterizable) if and only if the induced rate-capacity tuple $(\lambda(h), \omega(h))$ is achievable
(respectively, achievable with linear network codes). In other words, the problem of determining the
set of all almost entropic (respectively, almost linear group characterizable) functions can be
reduced to the solubility of a corresponding multicast problem. ■
In [24], a network, called the Vamos network, was constructed from the Vamos matroid. This 
was later used to prove that the LP bound is not tight and the bound can be tightened by applying 
a non-Shannon information inequality proved in [2]. 



In the following, we will use the duality results obtained in Section IV to provide another 
proof for the looseness of LP bound. 

Implication 2 (Looseness of LP bound): The LP outer bound can be tightened by any non- 
Shannon information inequality. 

Proof: Theorem 5 shows that the rate-capacity tuple $(\lambda(h), \omega(h))$ is in the LP bound if
$h$ is a polymatroid. Yet, Theorem 2 proves that $(\lambda(h), \omega(h))$ is achievable if and only if
$h$ is almost entropic. Consider the function $h$ defined as follows [2]:

$$h(1) = h(2) = h(3) = h(4) = 2a > 0,$$
$$h(1,2) = 3a,$$
$$h(3,4) = 4a,$$
$$h(1,3) = h(1,4) = h(2,3) = h(2,4) = 3a,$$
$$h(i,j,k) = 4a = h(1,2,3,4), \quad \forall \text{ distinct } i, j, k.$$

It can be verified directly that $h \in \Gamma$. However, the non-Shannon information inequality obtained
in [2] shows that $h \notin \bar{\Gamma}^{*}$. While the rate-capacity tuple $T(h)$ satisfies the LP bound,
it is not achievable, as $h$ is not almost entropic. ■
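
Both claims in this proof can be checked numerically. The following Python sketch is illustrative
only and is not part of the original argument. It assumes that the inequality of [2] is the
Zhang-Yeung inequality in its usual form, $2I(C;D) \leq I(A;B) + I(A;C,D) + 3I(C;D|A) + I(C;D|B)$,
applied with $(A,B,C,D) = (3,4,1,2)$; the helper names (`make_h`, `is_polymatroid`,
`zhang_yeung_gap`) are hypothetical.

```python
from itertools import combinations

GROUND = (1, 2, 3, 4)

def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    return [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r)]

def make_h(a=1.0):
    """The function h of Implication 2; a > 0 is a free scale parameter."""
    by_size = {0: 0.0, 1: 2 * a, 2: 3 * a, 3: 4 * a, 4: 4 * a}
    h = {s: by_size[len(s)] for s in subsets(GROUND)}
    h[frozenset({3, 4})] = 4 * a  # the only pair that differs from 3a
    return h

def is_polymatroid(h, ground, tol=1e-9):
    """Check h(empty) = 0, monotonicity and submodularity on all subset pairs."""
    subs = subsets(ground)
    if abs(h[frozenset()]) > tol:
        return False
    for a in subs:
        for b in subs:
            if a <= b and h[a] > h[b] + tol:
                return False
            if h[a] + h[b] + tol < h[a | b] + h[a & b]:
                return False
    return True

def cond_mi(h, a, b, c=()):
    """I(a; b | c) computed from the set function h."""
    a, b, c = frozenset(a), frozenset(b), frozenset(c)
    return h[a | c] + h[b | c] - h[a | b | c] - h[c]

def zhang_yeung_gap(h, A, B, C, D):
    """Right side minus left side of the assumed inequality; negative means violated."""
    rhs = (cond_mi(h, [A], [B]) + cond_mi(h, [A], [C, D])
           + 3 * cond_mi(h, [C], [D], [A]) + cond_mi(h, [C], [D], [B]))
    return rhs - 2 * cond_mi(h, [C], [D])

h = make_h(a=1.0)
print(is_polymatroid(h, GROUND))        # all Shannon-type inequalities hold
print(zhang_yeung_gap(h, 3, 4, 1, 2))   # negative, so the inequality is violated
```

Under these assumptions the gap equals $-a$, so the violation scales with $a$ and holds for every
$a > 0$.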
Using the same argument, any non-Shannon information inequality [2], [9], [10] will remove 
some polymatroids which are not almost entropic. The corresponding tuples in the LP bound 
will not be achievable. In other words, any set of non-Shannon information inequalities can be 
used to tighten the LP bound. 

In fact, together with the fact that $\bar{\Gamma}^{*}$ is not a polyhedron when the number of random
variables is at least four [10], our duality results lead to very interesting consequences.

First, we show that the set of achievable rate-capacity tuples is not a polyhedron in general.
Second, the LP bound is not only loose, but it remains loose even when tightened via application
of any finite number of linear non-Shannon information inequalities.

Proposition 5: The set of almost entropic functions is not a polytope.

Proof: [Proof sketch] The following is a sketch of the proof given by Matúš [10]. Matúš
constructed a convergent sequence of entropic functions $g_t \to g_0$ with one-sided tangent
$g_0^{+} = \lim_{t \to 0^{+}} (g_t - g_0)/t$. Clearly, if $\bar{\Gamma}^{*}$ is polyhedral, there exists
$\epsilon > 0$ such that $g_0 + \epsilon g_0^{+} \in \bar{\Gamma}^{*}$. This was shown not to be the case,
since $g_0 + \epsilon g_0^{+}$ violates some of the information inequalities proved in [10]. Therefore,
$\bar{\Gamma}^{*}$ is not polyhedral. Furthermore, there are infinitely many information
inequalities. ■

Implication 3 (Set of achievable rate-capacity tuples): The sets of achievable rate-capacity
tuples $\Upsilon^{\infty}$ and $\Upsilon^{\epsilon}$ for the constructed network and connection requirement
are not polytopes (when $N \geq 4$).

Proof: Consider the sequence $g_t \to g_0$ from the proof of Proposition 5. By Theorem 2,
$T(g_t)$ and $T(g_0)$ are asymptotically admissible. As $T(h)$ is a linear function of $h$, we have

$$\lim_{t \to 0^{+}} \frac{T(g_t) - T(g_0)}{t} = T(g_0^{+}). \tag{28}$$

For any $\epsilon > 0$,

$$T(g_0) + \epsilon T(g_0^{+}) = T(g_0 + \epsilon g_0^{+}). \tag{29}$$

As $g_0 + \epsilon g_0^{+}$ is not almost entropic, $T(g_0) + \epsilon T(g_0^{+})$ is not achievable. In
other words, $\Upsilon^{\infty}$ and $\Upsilon^{\epsilon}$ are not polytopes. ■
Now the LP bound is a polytope, while the capacity region is not. Furthermore, the introduction
of any finite number of additional linear inequalities in the LP bound simply results in another
polytope. Hence we obtain the following.

Implication 4 (Looseness of polyhedral bounds): The LP bound is not tight. Furthermore, no
finite number of linear information inequalities can tighten the LP bound $\Upsilon_{LP}$ to the set of
achievable rate-capacity tuples $\Upsilon^{\epsilon}$. In fact, any polyhedral outer bound for
$\Upsilon^{\epsilon}$ is not tight.

Proof: A direct consequence of Theorem 3 and Proposition 5. ■

B. Suboptimality of linear network codes 

As discussed in Section II-A, it may be practically desirable to use network codes with nice
algebraic properties that simplify encoding and decoding operations. Most algebraic network
codes considered in the literature are linear, and these were shown in [16] to be optimal for
single session multicast.

Since the appearance of [16], it has been an open question as to whether linear network
codes are in general optimal. This question was recently answered in the negative by Dougherty
et al. [20]. Their proof constructs a special network containing two subnetworks such that the
base fields required for optimality by each of the subnetworks have different characteristics,
establishing a contradiction.

The following provides an alternative proof using a completely different approach, making
use of the duality between entropy functions and achievability established in Section IV. The
proof is an immediate consequence of the duality results and the fact that some entropic functions
are not almost linear group characterizable.

Implication 5 (Suboptimality of linear network codes): There is a network and a connection
requirement such that the use of abelian network codes is suboptimal, including linear network
codes, $R$-module codes, and time-sharing of such.

Proof: Consider a set of four random variables $U_1, U_2, U_3, U_4$ constructed using the
projective plane described in [2]. The entropy function of these random variables is

$$h(1) = h(2) = h(3) = h(4) = \log 13,$$
$$h(1,2) = \log 6 + \log 13,$$
$$h(3,4) = \log 13 + \log 12,$$
$$h(1,3) = h(1,4) = h(2,3) = h(2,4) = \log 13 + \log 4,$$
$$h(i,j,k) = \log 13 + \log 12 = h(1,2,3,4), \quad \forall \text{ distinct } i, j, k.$$

Since $h$ is the entropy function of a set of random variables, $T(h)$ is achievable by Theorem 2.
Since $h$ does not satisfy the Ingleton inequality

$$h(1,2) + h(1,3) + h(1,4) + h(2,3) + h(2,4) \geq h(1) + h(2) + h(3,4) + h(1,2,3) + h(1,2,4), \tag{30}$$

$h$ is not almost linear group characterizable. By Theorem 4, $T(h)$ is not achievable by linear
network codes. ■

Implication 6 (Suboptimality of abelian group network codes): There is a network and a
multicast requirement for which abelian codes are (asymptotically) suboptimal.

Proof: Every abelian group characterizable entropy function must satisfy the Ingleton
inequality. The result then follows. ■
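
The violation of the Ingleton inequality (30) by the entropy function in the proof of Implication 5
is a short arithmetic check. The Python sketch below is illustrative only; the function names
`make_h` and `ingleton_gap` are hypothetical, and natural logarithms are used (the sign of the gap
does not depend on the base).

```python
from math import log

def make_h():
    """Entropy function from the proof of Implication 5, keyed by frozenset."""
    l13, l12, l6, l4 = log(13), log(12), log(6), log(4)
    h = {}
    for i in (1, 2, 3, 4):
        h[frozenset({i})] = l13
    for pair in ((1, 3), (1, 4), (2, 3), (2, 4)):
        h[frozenset(pair)] = l13 + l4
    h[frozenset({1, 2})] = l6 + l13
    h[frozenset({3, 4})] = l13 + l12
    for triple in ((1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)):
        h[frozenset(triple)] = l13 + l12
    h[frozenset({1, 2, 3, 4})] = l13 + l12
    return h

def ingleton_gap(h):
    """Left side minus right side of inequality (30); negative means violated."""
    f = lambda *s: h[frozenset(s)]
    lhs = f(1, 2) + f(1, 3) + f(1, 4) + f(2, 3) + f(2, 4)
    rhs = f(1) + f(2) + f(3, 4) + f(1, 2, 3) + f(1, 2, 4)
    return lhs - rhs

gap = ingleton_gap(make_h())
print(gap)  # negative: (30) fails for this h
```

The $\log 13$ terms cancel, leaving $\log 6 + 4 \log 4 - 3 \log 12 = \log(1536/1728) = \log(8/9) < 0$.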

VI. Conclusion

Entropy functions and network coding are already closely connected, through the network
coding capacity region which is expressed in terms of $\Gamma^{*}$. The main results of this paper,
summarized in Figure 4, further strengthen this connection. Figure 4 shows the inclusion
relationships of the various sets of interest, as well as the implications between set membership
of $h$ and $T(h)$ established by the theorems. Each arrow is labeled by the theorem number which
establishes the relation. Note that the relation of $\mathrm{con}(\Gamma^{*}_{L(q)})$ to most of the sets
shown in Figure 4(a) is unknown; hence the linear code relationships are shown separately in
Figure 4(b).



[Figure 4: (a) inclusion relationships among the sets of entropy functions and polymatroids; (b) linear codes.]

Fig. 4. Summary of the duality results.



Given a non-negative real function g whose domain consists of all non-empty subsets of N 
random variables, we have provided a construction for a network and a connection requirement 
such that a rate-capacity tuple is achievable if and only if g is almost entropic (i.e. satisfies every 
information inequality). The network topology depends only on the number of random variables, 
and not on the function g, which affects the construction only through the assignment of source 
rates and link capacities. 

An extension of this result shows that a rate-capacity tuple for the constructed multicast
problem is achievable by linear network codes if and only if the entropy function $g$ is almost
linear group characterizable. A further extension shows that the induced rate-capacity tuple
satisfies the linear programming bound if and only if the function is a polymatroid (i.e. satisfies
all Shannon-type inequalities). This extension is obtained using the concept of pseudo-variables,
which replace random variables in the domain of $g$. These pseudo-variables are abstract objects
that do not take any values, and are not associated with any probability distribution. The key is
that polymatroids defined over sets of pseudo-variables behave very similarly to entropy functions,
except that they lie in $\Gamma$ rather than $\Gamma^{*}$. This definition of pseudo-variables is not just
a matter of terminology. It is a non-trivial matter to generalize notions of extension and adhesion of
random variables (which rely on the existence of a probability distribution) to pseudo-variables.
We provided some examples of such extensions and adhesions, which leave the proof of the
main theorem intact under a substitution of pseudo-variables for random variables. We anticipate
that this concept of pseudo-variables, and their differences from random variables, may yet bear
more fruit in uncovering the structure of $\Gamma^{*}$.

The seemingly simple duality between entropy vectors and network codes has a number of
powerful implications. It renders the problem of network code solubility at least as hard
as the determination of $\Gamma^{*}$. We also obtain alternate proofs that the LP bound is not tight, and
that non-Shannon inequalities such as the Zhang-Yeung inequality indeed tighten the LP bound.
However, no additional finite number of inequalities can improve the LP bound to the capacity
region. Finally, we have proved the suboptimality of abelian network codes, including linear
codes, $R$-module codes and any scheme that time-shares between such codes. The duality result
also provides a tool to compare different classes of network codes. Rather than comparing the
codes directly, one can now compare the sets of entropy functions induced by the codes.

Appendix I
Proof of the Direct Part of Theorem 5

Before we prove the direct part of Theorem 5, we will prove some intermediate results which
show how to extend sets of pseudo-variables (build new pseudo-variables from old ones), and
how to adhere additional pseudo-variables to a given set of pseudo-variables (consistently join
two sets of pseudo-variables). These results are provided in Section I-A. The proof of Theorem
5 follows in Section I-B.

A. Adhesion and extension for pseudo-variables

For random variables, adhesion or extension is facilitated by the existence of an underlying
probability distribution. For example, consider two sets of random variables
$\mathcal{L} = \{X, U\}$ and $\mathcal{L}^{*} = \{X, W\}$ with respective underlying distributions
$P_{XU}$ and $P_{XW}$. Suppose that the marginals of $P_{XU}$ and $P_{XW}$ over $X$ coincide, and denote
the common marginal by $P_X$. We can then easily adhere $P_{XU}$ and $P_{XW}$ to obtain a new
distribution $Q_{XUW}$ whose marginals over $\mathcal{L}$ and $\mathcal{L}^{*}$ coincide with the given
ones, $Q_{XU} = P_{XU}$ and $Q_{XW} = P_{XW}$. One possibility is $Q_{XUW} = P_{XU} P_{XW} / P_X$. In
general, for any sets of random variables $\mathcal{L}$ and $\mathcal{L}^{*}$ with respective
distributions $P$ and $P^{*}$ coinciding on $\mathcal{L} \cap \mathcal{L}^{*}$, we can construct a new
distribution over $\mathcal{L} \cup \mathcal{L}^{*}$ such that its marginals over $\mathcal{L}$ and
$\mathcal{L}^{*}$ are $P$ and $P^{*}$. Clearly, the entropy function for
$\mathcal{L} \cup \mathcal{L}^{*}$ is an extension of those belonging to $\mathcal{L}$ and
$\mathcal{L}^{*}$.
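
The adhesion $Q_{XUW} = P_{XU} P_{XW} / P_X$ described above can be illustrated on a small example.
The Python sketch below is not part of the original text: the two distributions are made-up toy
values chosen so that their marginals over $X$ agree, and the function names (`marginal_x`,
`adhere`, `marginal`) are hypothetical.

```python
from collections import defaultdict

def marginal_x(p):
    """Marginal over the first coordinate of a distribution keyed by (x, other)."""
    m = defaultdict(float)
    for (x, _), v in p.items():
        m[x] += v
    return dict(m)

def adhere(p_xu, p_xw):
    """Q(x,u,w) = P(x,u) P(x,w) / P(x); assumes both X-marginals coincide."""
    p_x = marginal_x(p_xu)
    q = {}
    for (x, u), a in p_xu.items():
        for (x2, w), b in p_xw.items():
            if x == x2 and p_x[x] > 0:
                q[(x, u, w)] = a * b / p_x[x]
    return q

def marginal(q, keep):
    """Marginal of q over the coordinate positions listed in keep."""
    m = defaultdict(float)
    for key, v in q.items():
        m[tuple(key[i] for i in keep)] += v
    return dict(m)

# Toy distributions (illustrative values only); both have P_X = (0.5, 0.5).
P_XU = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.1, (1, 1): 0.4}
P_XW = {(0, 0): 0.5, (1, 0): 0.2, (1, 1): 0.3}

Q = adhere(P_XU, P_XW)
print(marginal(Q, (0, 1)))  # agrees with P_XU up to rounding
print(marginal(Q, (0, 2)))  # agrees with P_XW up to rounding
```

Under $Q$, the variables $U$ and $W$ are conditionally independent given $X$, which is one of many
valid ways to join the two distributions.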

Consider another simple example. Let $\mathcal{A} \subseteq \mathcal{L}$ be a subset of the random
variables $\mathcal{L}$. Then we can define a new random variable $W = \mathcal{A}$. By doing so, we
have constructed a new variable, and extended both the distribution and entropy function. Clearly
there are various ways to adhere or extend sets of random variables. Doing this for pseudo-variables
is not so straightforward. The following results provide several adhesion and extension methods for
pseudo-variables.

Lemma 10 (Functional extension): Let $\mathcal{L}$ be a set of pseudo-variables. For any given
$\mathcal{A} \subseteq \mathcal{L}$, one can adhere a new pseudo-variable $Y$ to $\mathcal{L}$ such that
$H(Y|\mathcal{A}) = H(\mathcal{A}|Y) = 0$. In other words, there exists a polymatroid $g$ over
$\mathcal{L} \cup \{Y\}$ satisfying

$$g(\mathcal{B}) = H(\mathcal{B}), \quad \forall \mathcal{B} \subseteq \mathcal{L}, \tag{31}$$
$$g(Y) = g(\mathcal{A}) = g(\{Y\} \cup \mathcal{A}). \tag{32}$$

Proof: Define $g$ over $\mathcal{L} \cup \{Y\}$ such that for all $\mathcal{B} \subseteq \mathcal{L}$,

$$g(\mathcal{B}) = H(\mathcal{B}) \quad \text{and} \quad g(\{Y\} \cup \mathcal{B}) = H(\mathcal{B} \cup \mathcal{A}). \tag{33}$$

It is straightforward to show that $g$ is a polymatroid satisfying (31) and (32). ■

In light of Definition 12, we shall refer to (33) as functional extension and denote the new
variable as $J_{\mathcal{A}}$. Clearly, any subset of pseudo-variables in $\mathcal{A}$ is a function of
$J_{\mathcal{A}}$.

Lemma 11 (Sum extension): Let $\{X, Y\}$ be a set of pseudo-variables such that $H(X) = H(Y)$
and $X \perp Y$. Then one can adhere a new pseudo-variable $Z$ to $\{X, Y\}$ such that $H(Z) = H(X)$
and $H(Z|X, Y) = H(X|Y, Z) = H(Y|X, Z) = 0$.

Proof: Let $g$ be the pseudo-entropy function for $\{X, Y\}$. Extend $g$ such that $g(Z) = g(X)$
and $g(X, Z) = g(Y, Z) = g(X, Y, Z) = g(X, Y)$. The resulting extended $g$ is still a polymatroid. ■

Lemma 11 shows that for any independent pseudo-variables $X$ and $Y$ of equal pseudo-entropies,
one can construct a pseudo-variable $Z$, denoted $Z = X \oplus Y$, such that its pseudo-entropy is the
same as that of $X$ and $Y$, and any single pseudo-variable is a function of the two others.
Structurally, this mimics the modulo-2 addition of two i.i.d. binary random variables.

Lemma 12 (SW extension): Let $\{X, Y\}$ be two pseudo-variables. Then one can adhere a new
pseudo-variable $Z$ to $\{X, Y\}$ such that

$$H(Z) = H(X|Y),$$
$$H(X|Z, Y) = 0,$$
$$H(Z|X) = 0.$$

Proof: Let $g$ be the pseudo-entropy of $\{X, Y\}$ and extend it as follows:
$g(Z) = g(X, Y) - g(Y)$, $g(Z, Y) = g(X, Y, Z) = g(X, Y)$, and $g(X, Z) = g(X)$. The resulting
extended $g$ is still a polymatroid. ■

Lemma 12 shows that starting with pseudo-variables $X, Y$, one can construct another
pseudo-variable $Z$ with pseudo-entropy $H(X, Y) - H(Y)$ such that $X$ is a function of $Y, Z$ and $Z$
is a function of $X$. For simplicity, we use the symbol $J_{X|Y}$ to denote the new pseudo-variable $Z$.

Lemmas 10-12 show that sets of pseudo-variables can be explicitly extended to obtain new
pseudo-variables. In the following, we study adhesion of existing sets of pseudo-variables.
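
The three extensions of Lemmas 10-12 are explicit enough to be checked by brute force on small
examples. The Python sketch below is illustrative only: the base polymatroids are made-up toy
values, and the function names are hypothetical. Each extension follows the formulas given in the
corresponding proof, and `is_polymatroid` tests non-negativity at the empty set, monotonicity and
submodularity over all pairs of subsets.

```python
from itertools import combinations

def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_polymatroid(g, ground, tol=1e-9):
    subs = subsets(ground)
    if abs(g[frozenset()]) > tol:
        return False
    for a in subs:
        for b in subs:
            if a <= b and g[a] > g[b] + tol:
                return False
            if g[a] + g[b] + tol < g[a | b] + g[a & b]:
                return False
    return True

def functional_extension(h, ground, a, new="Y"):
    """Lemma 10: g(B) = h(B) and g({Y} u B) = h(B u A)."""
    a = frozenset(a)
    g = {}
    for b in subsets(ground):
        g[b] = h[b]
        g[b | {new}] = h[b | a]
    return g

def sum_extension(h, x="X", y="Y", z="Z"):
    """Lemma 11: g(Z) = g(X), g(XZ) = g(YZ) = g(XYZ) = g(XY)."""
    fs = frozenset
    g = dict(h)
    g[fs({z})] = h[fs({x})]
    for s in ({x, z}, {y, z}, {x, y, z}):
        g[fs(s)] = h[fs({x, y})]
    return g

def sw_extension(h, x="X", y="Y", z="Z"):
    """Lemma 12: g(Z) = g(XY) - g(Y), g(ZY) = g(XYZ) = g(XY), g(XZ) = g(X)."""
    fs = frozenset
    g = dict(h)
    g[fs({z})] = h[fs({x, y})] - h[fs({y})]
    g[fs({y, z})] = g[fs({x, y, z})] = h[fs({x, y})]
    g[fs({x, z})] = h[fs({x})]
    return g

# Toy inputs (illustrative values only).
V = {"V1", "V2", "V3"}
h_v = {s: float(min(len(s), 2)) for s in subsets(V)}
h_ind = {frozenset(): 0.0, frozenset("X"): 1.0, frozenset("Y"): 1.0, frozenset("XY"): 2.0}
h_xy = {frozenset(): 0.0, frozenset("X"): 2.0, frozenset("Y"): 3.0, frozenset("XY"): 4.0}

g10 = functional_extension(h_v, V, {"V1", "V2"})
g11 = sum_extension(h_ind)
g12 = sw_extension(h_xy)
print(is_polymatroid(g10, V | {"Y"}),
      is_polymatroid(g11, "XYZ"),
      is_polymatroid(g12, "XYZ"))
```

Such a check is only a sanity test on specific inputs; the lemmas assert the polymatroid property
for every admissible input.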

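Lemma 13 (Independent adhesion): Let $\mathcal{L}$ and $\mathcal{L}^{*}$ be two disjoint sets of
pseudo-variables. Then they can adhere to each other independently such that for any
$\mathcal{A} \subseteq \mathcal{L} \cup \mathcal{L}^{*}$,

$$H(\mathcal{A}) = H(\mathcal{A} \cap \mathcal{L}) + H(\mathcal{A} \cap \mathcal{L}^{*}). \tag{34}$$

Proof: Let $g$ and $g^{*}$ be the pseudo-entropies of $\mathcal{L}$ and $\mathcal{L}^{*}$, and for each
$\mathcal{A} \subseteq \mathcal{L} \cup \mathcal{L}^{*}$ set
$g(\mathcal{A}) = g(\mathcal{A} \cap \mathcal{L}) + g^{*}(\mathcal{A} \cap \mathcal{L}^{*})$. It can be
verified that the extended $g$ is a polymatroid. ■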
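
The independent adhesion of Lemma 13 can be sketched in the same brute-force style. The Python
code below is illustrative only, with made-up toy polymatroids and hypothetical function names; it
builds the joint function from (34) and checks that the result is again a polymatroid.

```python
from itertools import combinations

def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_polymatroid(g, ground, tol=1e-9):
    subs = subsets(ground)
    if abs(g[frozenset()]) > tol:
        return False
    for a in subs:
        for b in subs:
            if a <= b and g[a] > g[b] + tol:
                return False
            if g[a] + g[b] + tol < g[a | b] + g[a & b]:
                return False
    return True

def independent_adhesion(g, ground, g_star, ground_star):
    """Joint function of (34): H(A) = g(A n L) + g*(A n L*), for disjoint L and L*."""
    ground, ground_star = frozenset(ground), frozenset(ground_star)
    if ground & ground_star:
        raise ValueError("ground sets must be disjoint")
    return {a: g[a & ground] + g_star[a & ground_star]
            for a in subsets(ground | ground_star)}

# Toy inputs (illustrative values only).
g = {frozenset(): 0.0, frozenset("X"): 1.0, frozenset("U"): 1.0, frozenset("XU"): 1.5}
g_star = {frozenset(): 0.0, frozenset("W"): 2.0}

joint = independent_adhesion(g, "XU", g_star, "W")
print(is_polymatroid(joint, "XUW"))
print(joint[frozenset("XW")] == g[frozenset("X")] + g_star[frozenset("W")])
```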
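Any subsets $\mathcal{A} \subseteq \mathcal{L}$ and $\mathcal{B} \subseteq \mathcal{L}^{*}$ are
independent, $\mathcal{A} \perp \mathcal{B}$, under the independent adhesion of $\mathcal{L}$ and
$\mathcal{L}^{*}$ in Lemma 13. Before we continue with more complicated adhesions, we need the
following proposition from [31].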

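Proposition 6: Let $\mathcal{L}$ and $\mathcal{L}^{*}$ be two sets of pseudo-variables coinciding over
$\mathcal{L}' = \mathcal{L} \cap \mathcal{L}^{*}$, i.e. for all $\mathcal{A} \subseteq \mathcal{L}'$, the
pseudo-entropy of $\mathcal{A}$ is the same with respect to $\mathcal{L}$ and $\mathcal{L}^{*}$. Further,
suppose

$$\Delta(\mathcal{A}, \mathcal{B}) \geq \Delta(\mathcal{L}' \cap \mathcal{A}, \mathcal{L}' \cap \mathcal{B}), \tag{35}$$

for all flats $\mathcal{A}, \mathcal{B}$ of $\mathcal{L}$, where
$\Delta(\mathcal{A}, \mathcal{B}) = H(\mathcal{A}) + H(\mathcal{B}) - H(\mathcal{A} \cup \mathcal{B}) - H(\mathcal{A} \cap \mathcal{B})$.
Then $\mathcal{L}$ and $\mathcal{L}^{*}$ can adhere to each other.

Proof: See Theorem 1 in [31]. ■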
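Corollary 1: Let $\mathcal{L} = \{X, Y, Z\}$ be a set of pseudo-variables such that $Z$ is a function
of $X, Y$ and $X$ is a function of $Y, Z$. Let $\mathcal{L}^{*}$ be another set of pseudo-variables such
that $\mathcal{L}$ and $\mathcal{L}^{*}$ coincide over
$\mathcal{L}' = \mathcal{L} \cap \mathcal{L}^{*} = \{X, Y\}$. Then $\mathcal{L}^{*}$ and $\mathcal{L}$ can
adhere to each other.

Proof: It is easy to verify that $\{X, Y\}$ and $\{Y, Z\}$ cannot be flats of $\mathcal{L}$. To prove
the corollary, it suffices to prove that (35) is satisfied for all flats of $\mathcal{L}$.

Suppose that $\mathcal{A}$ and $\mathcal{B}$ are flats of $\mathcal{L}$. If either $\mathcal{A}$ or
$\mathcal{B}$ is the empty set, $\{Z\}$ or $\{X, Y, Z\}$, then either
$\mathcal{L}' \cap \mathcal{A} \subseteq \mathcal{L}' \cap \mathcal{B}$ or
$\mathcal{L}' \cap \mathcal{B} \subseteq \mathcal{L}' \cap \mathcal{A}$. As a result,
$\Delta(\mathcal{L}' \cap \mathcal{A}, \mathcal{L}' \cap \mathcal{B}) = 0$ and (35) holds. On the other
hand, if both $\mathcal{A}$ and $\mathcal{B}$ are subsets of $\{X, Y\}$, then it is obvious that (35)
remains true. Now, suppose $\mathcal{A} = \{X, Z\}$. Then (35) holds for $\mathcal{B} = \{X\}$ or
$\{X, Z\}$. Finally, when $\mathcal{A} = \{X, Z\}$ and $\mathcal{B} = \{Y\}$, by direct verification,
(35) still holds. Combining all the cases, we see that (35) indeed holds for all flats of
$\mathcal{L}$. ■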
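
The case analysis in this proof can be mirrored numerically on one concrete instance. The Python
sketch below is illustrative only: the polymatroid `H` is a made-up example in which $Z$ is a
function of $X, Y$ and $X$ is a function of $Y, Z$ (it is the SW extension of a toy pair), and the
function names are hypothetical. It enumerates the flats and tests condition (35) with
$\mathcal{L}' = \{X, Y\}$.

```python
from itertools import combinations

GROUND = ("X", "Y", "Z")

def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    return [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r)]

# Toy pseudo-entropy function: H(XYZ) = H(XY) = H(YZ), so Z = f(X,Y) and X = f(Y,Z).
H = {frozenset(): 0, frozenset("X"): 2, frozenset("Y"): 3, frozenset("Z"): 1,
     frozenset("XY"): 4, frozenset("XZ"): 2, frozenset("YZ"): 4, frozenset("XYZ"): 4}

def flats(h, ground):
    """Sets whose value strictly increases on every proper superset."""
    subs = subsets(ground)
    return [a for a in subs if all(h[b] > h[a] for b in subs if a < b)]

def delta(h, a, b):
    """Delta(A, B) = H(A) + H(B) - H(A u B) - H(A n B)."""
    return h[a] + h[b] - h[a | b] - h[a & b]

def condition_35(h, ground, common):
    """Check Delta(A, B) >= Delta(L' n A, L' n B) for all flats A, B."""
    common = frozenset(common)
    fl = flats(h, ground)
    return all(delta(h, a, b) >= delta(h, a & common, b & common)
               for a in fl for b in fl)

print(sorted("".join(sorted(f)) for f in flats(H, GROUND)))
print(condition_35(H, GROUND, "XY"))
```

In this instance neither $\{X, Y\}$ nor $\{Y, Z\}$ is a flat, in line with the first step of the proof.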
Corollary 1 directly leads to the following result.

Theorem 7: Let $\mathcal{L}^{*} \supseteq \{X, Y\}$. Then one can adhere the pseudo-variable
$Z = J_{X|Y}$ to $\mathcal{L}^{*}$.

If in addition $H(X) = H(Y)$, it is possible to adhere a pseudo-variable $Z = X \oplus Y$ to
$\mathcal{L}^{*}$.

(Footnote to Proposition 6: a subset $\mathcal{A}$ of the ground set $\mathcal{L}$ is a flat if
$H(\mathcal{A}') > H(\mathcal{A})$ for all proper supersets $\mathcal{A}'$ containing $\mathcal{A}$.)

B. Proof for direct part of Theorem 5

Proof: To prove the direct part, we must exhibit a set of pseudo-variables satisfying the
set of (in)equalities (5). Our construction works as follows:

• Let $V_1, \ldots, V_N$ be pseudo-variables whose pseudo-entropy function is $h$.

• By Lemma 10, we can adhere $S_{[\mathcal{N}]} \triangleq J_{\mathcal{L}}$ to
$\mathcal{L} = \{V_1, \ldots, V_N\}$.

• For any non-empty subset $\alpha$ of $\mathcal{N}$, let $S_{[\alpha]}$ be a pseudo-variable whose
pseudo-entropy is $h(\alpha)$.

• By Lemma 13, we adhere independent pseudo-variables $S_{[\alpha]}$ to the current set of
pseudo-variables $\{V_1, \ldots, V_N, S_{[\mathcal{N}]}\}$.

• By Theorem 7, we can further adhere auxiliary pseudo-variables such as $J_{V_\alpha}$,
$J_{S_{[\mathcal{N}]} | J_{V_\alpha}}$, $J_{V_\alpha} \oplus S_{[\alpha]}$, etc.

Now, we will show how to associate pseudo-variables to edges. If the edge is uncapacitated, 
then the associated pseudo-variable is the join of the set of pseudo-variables incident to that edge. 
It remains to show that for the three subnetworks, we can adhere pseudo-variables meeting all 
the constraints of the LP bound. 

Consider type 0 subnetworks. Let $W = S_{[\alpha]}$. Then (5) clearly holds. In type 1 subnetworks
let W = Js[f^pvc, ^'^^ ^ '^Vc- Again, (|5j) holds. Finally, for type 2 subnetworks, let W = 
S[c,] © Jv^, W = Js^j^^yjv^, W" = Jjyjv., and W* = W** = Jy„. By direct verification, the set 
of (in)equalities (|5]) holds. ■ 